java.lang.Object
  org.switchboard.util.WordUtil
public final class WordUtil
A collection of methods to manipulate and analyze English text.
| Constructor Summary | |
|---|---|
| WordUtil() | |
| Method Summary | | |
|---|---|---|
| static float | calculateEnglishness(String s) | Calculates the percentage of words in the sentence that can be found in the dictionary. |
| static float | calculateEnglishness(String s, Dictionary d) | Calculates the percentage of words in the sentence that can be found in the dictionary. |
| static String | capitalize(String word) | Capitalizes the first letter of the string. |
| static int | commonStrings(String[] one, String[] two) | Calculates the number of strings that two arrays share, ignoring case. |
| static String | convertHTMLEntities(String input) | Converts the HTML entities in a string to the ASCII characters. |
| static String[] | getAdjectives(String in) | Gets all of the adjectives in a given sentence. |
| static String[] | getAdjectives(String in, Dictionary dict) | Gets all of the adjectives in a given sentence. |
| static String[] | getAdverbs(String in) | Gets all of the adverbs in a given sentence. |
| static String[] | getAdverbs(String in, Dictionary dict) | Gets all of the adverbs in a given sentence. |
| static String[] | getNouns(String in) | Gets all of the nouns out of the specified string. |
| static String[] | getNouns(String in, Dictionary dict) | Gets all of the nouns out of the specified string. |
| static String[] | getPronouns(String in) | Gets all of the pronouns in a given sentence. |
| static String[] | getPronouns(String in, Dictionary dict) | Gets all of the pronouns in a given sentence. |
| static String[] | getSynonyms(String word, int maxNum) | Gets the synonyms for the provided word. |
| static String[] | getSynonyms(String word, int maxNum, Thesaurus thes) | Gets the synonyms for the provided word. |
| static String[] | getTheseWords(String in, String pos, Dictionary dict) | Gets all words of a particular part of speech from a sentence. |
| static String[] | getVerbs(String in) | Gets all of the verbs out of the specified string. |
| static String[] | getVerbs(String in, Dictionary dict) | Gets all of the verbs out of the specified string. |
| static boolean | isCapitalized(String s) | Returns true if the first letter of the string is a capital letter. |
| static boolean | isEnglishWord(String s, Dictionary d) | Returns true if the word can be found in the dictionary. |
| static boolean | isFloat(String s) | Determines whether the characters in a string form a valid floating-point number. |
| static boolean | isInteger(String s) | Tells you if a string contains a number (floats allowed). |
| static String | lastFewWords(String sentence, int num) | Returns the last few words of a sentence. |
| static String | literal(String s) | Puts quotes around the string. |
| static float | match(String sentence, String query) | Uses the Lucene text search to match a Lucene query to a sentence. |
| static String | sentenceMakePretty(String sentence) | Capitalizes the first word in the string, uncapitalizes the rest of the words, strips all tabs, newlines, and superfluous spaces, and adds a period at the end. |
| static int | similarity(String word1, String word2) | Counts the number of synonyms that the words share. |
| static int | similarity(String word1, String word2, Thesaurus thes) | Counts the number of synonyms that the words share. |
| static String | stem(String in) | Returns the stem of the provided word using the PorterStemmer. |
| static String | stripHtml(String s) | Strips the HTML out of a string. |
| static String | stripNonWords(String in) | Strips all words that don't match [A-Za-z0-9,\\.'\"’\\-]+ |
| static String | stripStopwords(String in) | Strips stopwords from the provided sentence. |
| static String | wordWrap(String str, int n) | Inserts newlines after n characters, or at the last word before that. |
| static String[] | wordWrap(String str, int n, String[] lines) | Breaks up the string into lines n characters long. |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public WordUtil()
| Method Detail |
|---|
public static String lastFewWords(String sentence, int num)
sentence - The sentence to grab from.
num - The number of words to get from the end of the sentence.
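The summary describes lastFewWords as returning the last few words of a sentence. The following is a minimal sketch of that behavior, not the library's implementation. It assumes words are separated by whitespace and that asking for more words than exist returns the whole sentence; the class name LastFewWordsSketch is hypothetical.

```java
import java.util.Arrays;

// Hypothetical stand-in for WordUtil.lastFewWords; the real method may differ.
final class LastFewWordsSketch {
    static String lastFewWords(String sentence, int num) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty() || num <= 0) {
            return "";
        }
        // Assumption: any run of whitespace separates two words.
        String[] words = trimmed.split("\\s+");
        int start = Math.max(0, words.length - num);
        return String.join(" ", Arrays.copyOfRange(words, start, words.length));
    }
}
```

Under these assumptions, lastFewWords("the quick brown fox", 2) yields "brown fox".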
public static int similarity(String word1, String word2)
word1 - The first word to look up and compare
word2 - The second word to look up and compare
public static int similarity(String word1, String word2, Thesaurus thes)
word1 - The first word to look up and compare
word2 - The second word to look up and compare
thes - The thesaurus to use to do the comparison
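The summary says similarity counts the number of synonyms that the two words share. The Thesaurus type is not documented on this page, so the sketch below substitutes a plain Map from a word to its synonym set. Both that stand-in and the class name SimilaritySketch are assumptions for illustration only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a Map<String, Set<String>> plays the role of the Thesaurus.
final class SimilaritySketch {
    static int similarity(String word1, String word2, Map<String, Set<String>> thesaurus) {
        Set<String> shared = new HashSet<>(
                thesaurus.getOrDefault(word1, Collections.<String>emptySet()));
        // Keep only the synonyms that also appear for the second word.
        shared.retainAll(thesaurus.getOrDefault(word2, Collections.<String>emptySet()));
        return shared.size();
    }
}
```

A word missing from the map is treated as having no synonyms, so the result is 0 in that case.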
public static String capitalize(String word)
word - The string whose first letter is to be capitalized.
public static boolean isCapitalized(String s)
s - The string to check
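As a rough illustration of capitalize and isCapitalized as the summary describes them, the sketch below uses only java.lang.Character. It is an assumption that null and empty strings are passed through or rejected as shown; the page does not say how the library handles them. The class name CapitalizeSketch is hypothetical.

```java
// Hypothetical stand-ins for WordUtil.capitalize and WordUtil.isCapitalized.
final class CapitalizeSketch {
    static String capitalize(String word) {
        if (word == null || word.isEmpty()) {
            return word; // Assumption: nothing to capitalize, return unchanged.
        }
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    static boolean isCapitalized(String s) {
        // Assumption: null and empty strings count as not capitalized.
        return s != null && !s.isEmpty() && Character.isUpperCase(s.charAt(0));
    }
}
```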
public static int commonStrings(String[] one, String[] two)
one - The first array of strings
two - The second array of strings to check
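One way to count the strings two arrays share while ignoring case is to lowercase both sides and intersect them as sets. This sketch assumes duplicates are counted once, which the page does not specify; the class name CommonStringsSketch is hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for WordUtil.commonStrings; duplicate handling is an assumption.
final class CommonStringsSketch {
    static int commonStrings(String[] one, String[] two) {
        Set<String> first = new HashSet<>();
        for (String s : one) {
            first.add(s.toLowerCase(Locale.ROOT));
        }
        Set<String> shared = new HashSet<>();
        for (String s : two) {
            String key = s.toLowerCase(Locale.ROOT);
            if (first.contains(key)) {
                shared.add(key); // Each shared string is counted once.
            }
        }
        return shared.size();
    }
}
```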
public static String stripHtml(String s)
s - The HTML-filled string
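A naive way to strip HTML is to delete everything between angle brackets with a regular expression. The sketch below does only that. It does not decode entities or handle comments, scripts, or a literal ">" inside an attribute value, and nothing on this page confirms that the library works this way. The class name StripHtmlSketch is hypothetical.

```java
// Hypothetical, deliberately naive stand-in for WordUtil.stripHtml.
final class StripHtmlSketch {
    static String stripHtml(String s) {
        // Remove every run of the form <...>; text between tags is kept as-is.
        return s.replaceAll("<[^>]*>", "");
    }
}
```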
public static String[] getSynonyms(String word, int maxNum)
word - The word to look up
maxNum - The maximum number of synonyms to get
public static boolean isInteger(String s)
s - The String to test
public static boolean isFloat(String s)
s - The string to test
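A common way to check whether a string is a valid floating-point number is to attempt a parse and catch the failure. The sketch below takes that approach with Float.parseFloat, which is an assumption about the technique rather than a description of the library. Note that Float.parseFloat also accepts forms such as "1e3", "NaN", and "Infinity". The class name IsFloatSketch is hypothetical.

```java
// Hypothetical stand-in for WordUtil.isFloat, based on parse-and-catch.
final class IsFloatSketch {
    static boolean isFloat(String s) {
        if (s == null) {
            return false; // Float.parseFloat(null) would throw NullPointerException.
        }
        try {
            Float.parseFloat(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```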
public static String convertHTMLEntities(String input)
input - The HTML Entity-filled String
public static String[] getSynonyms(String word, int maxNum, Thesaurus thes)
thes - The thesaurus to use to look up the word
word - The word to look up
maxNum - The maximum number of synonyms to get
public static String stripStopwords(String in)
in - The stopword-filled sentence.
See also: Stopwords
public static String literal(String s)
s - The unquoted String
public static String stripNonWords(String in)
in - The non-word-filled String
public static String stem(String in)
in - An English word
public static String sentenceMakePretty(String sentence)
sentence - the un-pretty sentence.
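The summary lists four steps for sentenceMakePretty: capitalize the first word, lowercase the rest, collapse tabs, newlines, and extra spaces, and add a period. The sketch below strings those steps together. It assumes a sentence that already ends in a period does not get a second one, which the page does not state; the class name SentencePrettySketch is hypothetical.

```java
import java.util.Locale;

// Hypothetical stand-in for WordUtil.sentenceMakePretty.
final class SentencePrettySketch {
    static String sentenceMakePretty(String sentence) {
        // Collapse tabs, newlines, and repeated spaces into single spaces.
        String collapsed = sentence.replaceAll("\\s+", " ").trim();
        if (collapsed.isEmpty()) {
            return collapsed;
        }
        String lower = collapsed.toLowerCase(Locale.ROOT);
        String result = Character.toUpperCase(lower.charAt(0)) + lower.substring(1);
        // Assumption: do not append a second period.
        return result.endsWith(".") ? result : result + ".";
    }
}
```

Under these assumptions, the input "  hELLO\tthere\n  WORLD " becomes "Hello there world.".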
public static String[] getAdverbs(String in)
in - The full sentence
public static String[] getAdverbs(String in, Dictionary dict)
dict - The dictionary to use to determine the part of speech
in - The full sentence
public static String[] getPronouns(String in)
in - The full sentence
public static String[] getPronouns(String in, Dictionary dict)
dict - The dictionary to use to determine the part of speech
in - The full sentence
public static String[] getAdjectives(String in)
in - The full sentence
public static String[] getAdjectives(String in, Dictionary dict)
dict - The dictionary to use to determine the part of speech
in - The full sentence
public static String[] getNouns(String in)
in - The string to get the nouns from.
public static String[] getNouns(String in, Dictionary dict)
in - The string from which to get the nouns.
dict - The dictionary to use to do the checking.
public static String[] getVerbs(String in)
in - The string to get the verbs from.
public static String[] getVerbs(String in, Dictionary dict)
in - The string from which to get the verbs.
dict - The dictionary to use to do the checking.
public static String[] getTheseWords(String in, String pos, Dictionary dict)
in - The sentence from which to get the words.
pos - The part of speech to look for.
dict - The dictionary to use to find the part of speech.
public static float calculateEnglishness(String s)
s - The sentence to analyze
public static float calculateEnglishness(String s, Dictionary d)
d - The dictionary to use to look up the word
s - The sentence to analyze
public static boolean isEnglishWord(String s, Dictionary d)
s - The word to test
d - The dictionary to use to look up the word
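To illustrate how calculateEnglishness and isEnglishWord relate, the sketch below replaces the undocumented Dictionary type with a Set of lowercase words. It assumes whitespace-separated words, case-insensitive lookup, and a result on a 0 to 100 scale; the page says "percentage" but does not give the range. The class name EnglishnessSketch is hypothetical.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: a Set<String> of lowercase words plays the role of the Dictionary.
final class EnglishnessSketch {
    static boolean isEnglishWord(String s, Set<String> dictionary) {
        return dictionary.contains(s.toLowerCase(Locale.ROOT));
    }

    static float calculateEnglishness(String s, Set<String> dictionary) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return 0f; // Assumption: an empty sentence scores zero.
        }
        String[] words = trimmed.split("\\s+");
        int found = 0;
        for (String word : words) {
            if (isEnglishWord(word, dictionary)) {
                found++;
            }
        }
        // Assumption: the percentage is reported on a 0-100 scale.
        return 100f * found / words.length;
    }
}
```

This sketch does not strip punctuation, so a token such as "cat," would not match "cat".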
public static String wordWrap(String str, int n)
str - The string to wrap
n - The number of chars to wrap at.
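The summary describes wordWrap as inserting newlines after n characters, or at the last word before that. A greedy wrap is one plausible reading: keep adding words to the current line until the next one would exceed n, then start a new line. The sketch assumes whitespace-separated words and leaves a single word longer than n unbroken. The class name WordWrapSketch is hypothetical.

```java
// Hypothetical greedy stand-in for WordUtil.wordWrap(String, int).
final class WordWrapSketch {
    static String wordWrap(String str, int n) {
        StringBuilder out = new StringBuilder();
        int lineLen = 0;
        for (String word : str.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // Only happens for an empty or blank input.
            }
            if (lineLen > 0 && lineLen + 1 + word.length() > n) {
                out.append('\n'); // The word does not fit: break before it.
                lineLen = 0;
            } else if (lineLen > 0) {
                out.append(' ');
                lineLen++;
            }
            out.append(word);
            lineLen += word.length();
        }
        return out.toString();
    }
}
```

With n set to 10, "the quick brown fox jumps" wraps to three lines: "the quick", "brown fox", and "jumps".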
public static String[] wordWrap(String str, int n, String[] lines)
str - The string to wrap
n - The number of chars to wrap at.
lines - The String array in which to put the broken lines
public static float match(String sentence, String query) throws ParseException
sentence - The sentence to search
query - The Lucene query
org.apache.lucene.queryParser.ParseException - If the Lucene query is not valid