Class java.text.RuleBasedCollator

    java.lang.Object
       |
       +----java.text.Collator
               |
               +----java.text.RuleBasedCollator

The RuleBasedCollator class is a concrete subclass of
Collator that provides a simple, data-driven, table collator.
With this class you can create a customized table-based Collator.
RuleBasedCollator maps characters to sort keys.
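
As a rough illustration of what "maps characters to sort keys" means in practice, the sketch below builds a collator from a deliberately reversed four-letter rule string and sorts strings by their CollationKey values. The class name SortKeyDemo and the helpers collatorFor and sortByKeys are hypothetical names introduced here, and the reversed rule string is an assumption chosen only to make the effect visible.

```java
import java.text.CollationKey;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

class SortKeyDemo {
    /** Builds a collator from a rule string, rethrowing rule errors unchecked. */
    static RuleBasedCollator collatorFor(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules: " + rules, e);
        }
    }

    /** Sorts the given words by comparing their precomputed collation keys. */
    static String[] sortByKeys(RuleBasedCollator collator, String... words) {
        CollationKey[] keys = new CollationKey[words.length];
        for (int i = 0; i < words.length; i++) {
            keys[i] = collator.getCollationKey(words[i]);
        }
        Arrays.sort(keys); // CollationKey implements Comparable<CollationKey>
        String[] sorted = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            sorted[i] = keys[i].getSourceString();
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Illustrative rules only: "d" sorts first and "a" sorts last.
        RuleBasedCollator reversed = collatorFor("< d < c < b < a");
        System.out.println(Arrays.toString(sortByKeys(reversed, "a", "b", "c", "d")));
        // compare() gives the same ordering as the keys do.
        System.out.println(reversed.compare("d", "a") < 0);
    }
}
```

Sorting by keys is mainly worthwhile when the same strings are compared many times; for a one-off comparison, compare is usually enough.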
The collation table is composed of a list of collation rules, where each rule takes one of three forms:

    < modifier >
    < relation > < text-argument >
    < reset > < text-argument >

The following demonstrates how to create your own collation rules:
This sounds more complicated than it is in practice. For example, the following are equivalent ways of expressing the same thing:

    a < b < c
    a < b & b < c
    a < c & a < b

Notice that the order is important, as the subsequent item goes immediately after the text-argument. The following are not equivalent:

    a < b & a < c
    a < c & a < b

Either the text-argument must already be present in the sequence, or some initial substring of the text-argument must be present (e.g. "a < b & ae < e" is valid since "a" is present in the sequence before "ae" is reset). In this latter case, "ae" is not entered and treated as a single character; instead, "e" is sorted as if it were expanded to two characters: "a" followed by an "e". This difference appears in natural languages: in traditional Spanish "ch" is treated as though it contracts to a single character (expressed as "c < ch < d"), while in traditional German "ä" (a-umlaut) is treated as though it expands to two characters (expressed as "a & ae ; ä < b").

Ignorable Characters

For ignorable characters, the first rule must start with a relation (the examples we have used above are really fragments; "a < b" really should be "< a < b"). If, however, the first relation is not "<", then all the text-arguments up to the first "<" are ignorable. For example, ", - < a < b" makes "-" an ignorable character, as we saw earlier in the word "black-birds". In the samples for different languages, you see that most accents are ignorable.

Normalization and Accents

Errors

A rule string that contains an error causes RuleBasedCollator to throw a ParseException.
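
A minimal sketch of how a caller might detect rejected rules. The class RuleErrorDemo and the helper isValid are hypothetical names, and the two rejected inputs (an empty rule string and a rule containing an unquoted "?") are assumptions picked for illustration, not a complete list of error cases.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class RuleErrorDemo {
    /** Returns true if the rule string is accepted, false if it is rejected. */
    static boolean isValid(String rules) {
        try {
            new RuleBasedCollator(rules);
            return true;
        } catch (ParseException e) {
            // ParseException carries a message and the offset where parsing failed.
            System.out.println("Rejected \"" + rules + "\": " + e.getMessage()
                    + " (offset " + e.getErrorOffset() + ")");
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("< a < b < c < d")); // well-formed rules
        System.out.println(isValid(""));                // assumed invalid: no rules at all
        System.out.println(isValid("< a < ? < d"));     // assumed invalid: unquoted punctuation
    }
}
```

Because ParseException is a checked exception, code that builds collators from user-supplied rule strings has to handle it explicitly, as isValid does here.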

Examples

Simple: "< a < b < c < d"

Norwegian: "< a,A< b,B< c,C< d,D< e,E< f,F< g,G< h,H< i,I< j,J < k,K< l,L< m,M< n,N< o,O< p,P< q,Q< r,R< s,S< t,T < u,U< v,V< w,W< x,X< y,Y< z,Z < å=a\u030A,Å=A\u030A ;aa,AA< æ,Æ< ø,Ø"

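Normally, to create a rule-based Collator object, you will use Collator's factory method getInstance. To create a rule-based Collator object with rules tailored to your needs, construct the RuleBasedCollator with the rules contained in a String object. For example:

    // The constructor throws ParseException if the rules are malformed.
    String Simple = "< a < b < c < d";
    RuleBasedCollator mySimple = new RuleBasedCollator(Simple);

Or:

    String Norwegian = "< a,A< b,B< c,C< d,D< e,E< f,F< g,G< h,H< i,I< j,J" +
                       "< k,K< l,L< m,M< n,N< o,O< p,P< q,Q< r,R< s,S< t,T" +
                       "< u,U< v,V< w,W< x,X< y,Y< z,Z" +
                       "< å=a\u030A,Å=A\u030A" +
                       ";aa,AA< æ,Æ< ø,Ø";
    RuleBasedCollator myNorwegian = new RuleBasedCollator(Norwegian);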
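
The claims made earlier about equivalent and non-equivalent rule orderings can be checked with a small program. This is a sketch under stated assumptions: the class ResetOrderDemo and the helpers collatorFor and orderOfAbc are hypothetical names, and each rule fragment from the text is written with the leading "<" that a complete rule string needs.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

class ResetOrderDemo {
    /** Builds a collator from a rule string, rethrowing rule errors unchecked. */
    static RuleBasedCollator collatorFor(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules: " + rules, e);
        }
    }

    /** Returns the letters a, b and c joined in the order the rules sort them. */
    static String orderOfAbc(String rules) {
        String[] letters = {"a", "b", "c"};
        Arrays.sort(letters, collatorFor(rules)); // a Collator is also a Comparator
        return String.join("", letters);
    }

    public static void main(String[] args) {
        // Three equivalent ways of expressing the same ordering.
        System.out.println(orderOfAbc("< a < b < c"));
        System.out.println(orderOfAbc("< a < b & b < c"));
        System.out.println(orderOfAbc("< a < c & a < b"));
        // Not equivalent to the line above: here "c" is placed
        // immediately after "a", ahead of the existing "b".
        System.out.println(orderOfAbc("< a < b & a < c"));
    }
}
```

The first three calls should all report the same order, while the last one differs because the item following a reset goes immediately after the reset's text-argument.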

Combining Collators is as simple as concatenating their rule strings. The following example combines two Collators from two different locales:

    // Create an en_US Collator object
    RuleBasedCollator en_USCollator = (RuleBasedCollator)
        Collator.getInstance(new Locale("en", "US", ""));
    // Create a da_DK Collator object
    RuleBasedCollator da_DKCollator = (RuleBasedCollator)
        Collator.getInstance(new Locale("da", "DK", ""));
    // Combine the two
    // First, get the collation rules from en_USCollator
    String en_USRules = en_USCollator.getRules();
    // Second, get the collation rules from da_DKCollator
    String da_DKRules = da_DKCollator.getRules();
    RuleBasedCollator newCollator =
        new RuleBasedCollator(en_USRules + da_DKRules);
    // newCollator has the combined rules

A more interesting example is to make changes to an existing table to create a new Collator object:

    // Create a new Collator object with additional rules
    String addRules = "& C < ch, cH, Ch, CH";
    // Append to the rule string of the existing collator, not to the collator itself
    RuleBasedCollator myCollator =
        new RuleBasedCollator(en_USCollator.getRules() + addRules);
    // myCollator contains the new rules

The following example demonstrates how to change the order of non-spacing accents:

    // old rule
    String oldRules = "=?;?;?;?"    // main accents
                    + ";?;?;?;?"    // main accents
                    + ";?;?;?;?"    // main accents
                    + ";?;?;?;?"    // main accents
                    + ";?;?;?;?"    // main accents
                    + "< a , A ; ae, AE ; æ , Æ"
                    + "< b , B < c, C < e, E & C < d, D";
    // change the order of accent characters
    String addOn = "& ? ; ? ; ?";
    RuleBasedCollator myCollator = new RuleBasedCollator(oldRules + addOn);
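
Returning to the "& C < ch, cH, Ch, CH" tailoring rule, here is a self-contained sketch that can be run to see its effect. It assumes that Collator.getInstance(Locale.US) returns a RuleBasedCollator, which the cast relies on but the Collator API does not promise; the class TailorDemo and the helper tailoredUs are hypothetical names.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

class TailorDemo {
    /** Appends extra rules to the en_US rules and builds a new collator. */
    static RuleBasedCollator tailoredUs(String extraRules) {
        // Assumption: the en_US collator is rule-based, so the cast succeeds.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        try {
            return new RuleBasedCollator(base.getRules() + extraRules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid extra rules: " + extraRules, e);
        }
    }

    public static void main(String[] args) {
        Collator base = Collator.getInstance(Locale.US);
        RuleBasedCollator tailored = tailoredUs("& C < ch, cH, Ch, CH");

        // Untailored: "ch" is two letters, so it sorts before "cz".
        System.out.println("base     ch vs cz: " + base.compare("ch", "cz"));
        // Tailored: "ch" acts as one unit placed after "c", so it sorts after "cz"...
        System.out.println("tailored ch vs cz: " + tailored.compare("ch", "cz"));
        // ...but still before "d".
        System.out.println("tailored ch vs d:  " + tailored.compare("ch", "d"));
    }
}
```

Only the sign of each compare result is meaningful; the exact integer values are not something to rely on.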

The last example shows how to put a new primary ordering in before the
default setting. For example, in a Japanese Collator, a few Japanese characters can be made to sort before the English characters:

    // get en_US Collator rules
    RuleBasedCollator en_USCollator =
        (RuleBasedCollator) Collator.getInstance(Locale.US);
    // add a few Japanese characters to sort before English characters
    // suppose the last character before the first base letter 'a' in
    // the English collation rule is ?
    String jaString = "& ? < ?, ? < ?, ?";
    RuleBasedCollator myJapaneseCollator =
        new RuleBasedCollator(en_USCollator.getRules() + jaString);

public RuleBasedCollator(String rules) throws ParseException
public String getRules()
public CollationElementIterator getCollationElementIterator(String source)
public int compare(String source, String target)
public CollationKey getCollationKey(String source)
public Object clone()
public boolean equals(Object obj)
public int hashCode()