API Docs for SAX and DOM
 

XMLString.hpp

/*
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 1999-2000 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Xerces" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation, and was
 * originally based on software copyright (c) 1999, International
 * Business Machines, Inc., http://www.ibm.com .  For more information
 * on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */

/*
 * $Log: XMLString.hpp,v $
 * Revision 1.15  2001/01/15 21:26:34  tng
 * Performance Patches by David Bertoni.
 *
 * Details: (see xerces-c-dev mailing Jan 14)
 * XMLRecognizer.cpp: the internal encoding string XMLUni::fgXMLChEncodingString
 * was going through this function numerous times.  As a result, the top hot-spot
 * for the parse was _wcsicmp().  The real problem is that Microsoft's wide string
 * functions are unbelievably slow.  For things like encodings, it might be
 * better to use a special comparison function that only considers a-z and
 * A-Z as characters with case.  This works since the character set for
 * encodings is limited to printable ASCII characters.
 *
 * XMLScanner2.cpp: This also has some case-sensitive vs. insensitive compares.
 * They are also much faster.  The other tweak is to only make a copy of an attribute
 * string if it needs to be split.  And then, the strategy is to try to use a
 * stack-based buffer, rather than a dynamically-allocated one.
 *
 * SAX2XMLReaderImpl.cpp: Again, more case-sensitive vs. insensitive comparisons.
 *
 * KVStringPair.cpp & hpp: By storing the size of the allocation, the storage can
 * likely be re-used many times, cutting down on dynamic memory allocations.
 *
 * XMLString.hpp: a more efficient implementation of stringLen().
 *
 * DTDValidator.cpp: another case of using a stack-based buffer when possible.
 *
 * These patches made a big difference in parse time in some of our test
 * files, especially the ones that are very attribute-heavy.
 *
 * Revision 1.14  2000/10/13 22:47:57  andyh
 * Fix bug (failure to null-terminate result) in XMLString::trim().
 * Patch contributed by Nadav Aharoni
 *
 * Revision 1.13  2000/04/12 18:42:15  roddey
 * Improved docs in terms of what 'max chars' means in the method
 * parameters.
 *
 * Revision 1.12  2000/04/06 19:42:51  rahulj
 * Clarified how big the target buffer should be in the API
 * documentation.
 *
 * Revision 1.11  2000/03/23 01:02:38  roddey
 * Updates to the XMLURL class to correct a lot of parsing problems
 * and to add support for the port number. Updated the URL tests
 * to test some of this new stuff.
 *
 * Revision 1.10  2000/03/20 23:00:46  rahulj
 * Moved the inline definition of stringLen before the first
 * use. This satisfied the HP CC compiler.
 *
 * Revision 1.9  2000/03/02 19:54:49  roddey
 * This checkin includes many changes done while waiting for the
 * 1.1.0 code to be finished. I can't list them all here, but a list is
 * available elsewhere.
 *
 * Revision 1.8  2000/02/24 20:05:26  abagchi
 * Swat for removing Log from API docs
 *
 * Revision 1.7  2000/02/16 18:51:52  roddey
 * Fixed some facts in the docs and reformatted the docs to stay within
 * a reasonable line width.
 *
 * Revision 1.6  2000/02/16 17:07:07  abagchi
 * Added API docs
 *
 * Revision 1.5  2000/02/06 07:48:06  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.4  2000/01/12 00:16:23  roddey
 * Changes to deal with multiply nested, relative pathed, entities and to deal
 * with the new URL class changes.
 *
 * Revision 1.3  1999/12/18 00:18:10  roddey
 * More changes to support the new, completely orthogonal support for
 * intrinsic encodings.
 *
 * Revision 1.2  1999/12/15 19:41:28  roddey
 * Support for the new transcoder system, where even intrinsic encodings are
 * done via the same transcoder abstraction as external ones.
 *
 * Revision 1.1.1.1  1999/11/09 01:05:52  twl
 * Initial checkin
 *
 * Revision 1.2  1999/11/08 20:45:21  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

#if !defined(XMLSTRING_HPP)
#define XMLSTRING_HPP

#include <util/XercesDefs.hpp>

class XMLLCPTranscoder;

class XMLString
{
public:
    /* Static methods for native character mode string manipulation */

    static void binToText
    (
        const   unsigned int    toFormat
        ,       char* const     toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   unsigned int    toFormat
        ,       XMLCh* const    toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   unsigned long   toFormat
        ,       char* const     toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   unsigned long   toFormat
        ,       XMLCh* const    toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   long            toFormat
        ,       char* const     toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   long            toFormat
        ,       XMLCh* const    toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   int             toFormat
        ,       char* const     toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static void binToText
    (
        const   int             toFormat
        ,       XMLCh* const    toFill
        , const unsigned int    maxChars
        , const unsigned int    radix
    );

    static bool textToBin
    (
        const   XMLCh* const    toConvert
        ,       unsigned int&   toFill
    );

    static void catString
    (
                char* const     target
        , const char* const     src
    );

    static void catString
    (
                XMLCh* const    target
        , const XMLCh* const    src
    );

    static int compareIString
    (
        const   char* const     str1
        , const char* const     str2
    );

    static int compareIString
    (
        const   XMLCh* const    str1
        , const XMLCh* const    str2
    );

    static int compareNString
    (
        const   char* const     str1
        , const char* const     str2
        , const unsigned int    count
    );

    static int compareNString
    (
        const   XMLCh* const    str1
        , const XMLCh* const    str2
        , const unsigned int    count
    );

    static int compareNIString
    (
        const   char* const     str1
        , const char* const     str2
        , const unsigned int    count
    );

    static int compareNIString
    (
        const   XMLCh* const    str1
        , const XMLCh* const    str2
        , const unsigned int    count
    );

    static int compareString
    (
        const   char* const     str1
        , const char* const     str2
    );

    static int compareString
    (
        const   XMLCh* const    str1
        , const XMLCh* const    str2
    );

    static void copyString
    (
                char* const     target
        , const char* const     src
    );

    static void copyString
    (
                XMLCh* const    target
        , const XMLCh* const    src
    );

    static bool copyNString
    (
                XMLCh* const    target
        , const XMLCh* const    src
        , const unsigned int    maxChars
    );

    static unsigned int hash
    (
        const   char* const     tohash
        , const unsigned int    hashModulus
    );

    static unsigned int hash
    (
        const   XMLCh* const    toHash
        , const unsigned int    hashModulus
    );

    static unsigned int hashN
    (
        const   XMLCh* const    toHash
        , const unsigned int    numChars
        , const unsigned int    hashModulus
    );

    static int indexOf(const char* const toSearch, const char ch);

    static int indexOf(const XMLCh* const toSearch, const XMLCh ch);

    static int lastIndexOf(const char* const toSearch, const char ch);

    static int lastIndexOf(const XMLCh* const toSearch, const XMLCh ch);

    static int lastIndexOf
    (
        const   char* const     toSearch
        , const char            chToFind
        , const unsigned int    fromIndex
    );

    static int lastIndexOf
    (
        const   XMLCh* const    toSearch
        , const XMLCh           ch
        , const unsigned int    fromIndex
    );

    static void moveChars
    (
                XMLCh* const    targetStr
        , const XMLCh* const    srcStr
        , const unsigned int    count
    );

    static char* replicate(const char* const toRep);

    static XMLCh* replicate(const XMLCh* const toRep);

    static bool startsWith
    (
        const   char* const     toTest
        , const char* const     prefix
    );

    static bool startsWith
    (
        const   XMLCh* const    toTest
        , const XMLCh* const    prefix
    );

    static bool startsWithI
    (
        const   char* const     toTest
        , const char* const     prefix
    );

    static bool startsWithI
    (
        const   XMLCh* const    toTest
        , const XMLCh* const    prefix
    );

    static const XMLCh* findAny
    (
        const   XMLCh* const    toSearch
        , const XMLCh* const    searchList
    );

    static XMLCh* findAny
    (
                XMLCh* const    toSearch
        , const XMLCh* const    searchList
    );

    static unsigned int stringLen(const char* const src);

    static unsigned int stringLen(const XMLCh* const src);

    static void cut
    (
                XMLCh* const    toCutFrom
        , const unsigned int    count
    );

    static char* transcode
    (
        const   XMLCh* const    toTranscode
    );

    static bool transcode
    (
        const   XMLCh* const    toTranscode
        ,       char* const     toFill
        , const unsigned int    maxChars
    );

    static XMLCh* transcode
    (
        const   char* const     toTranscode
    );

    static bool transcode
    (
        const   char* const     toTranscode
        ,       XMLCh* const    toFill
        , const unsigned int    maxChars
    );

    static void trim(char* const toTrim);

    static void trim(XMLCh* const toTrim);

    static XMLCh* makeUName
    (
        const   XMLCh* const    pszURI
        , const XMLCh* const    pszName
    );

    static unsigned int replaceTokens
    (
                XMLCh* const    errText
        , const unsigned int    maxChars
        , const XMLCh* const    text1
        , const XMLCh* const    text2
        , const XMLCh* const    text3
        , const XMLCh* const    text4
    );

    static void upperCase(XMLCh* const toUpperCase);

private :

    XMLString();
    ~XMLString();

    static void initString(XMLLCPTranscoder* const defToUse);
    static void termString();
    friend class XMLPlatformUtils;
};

// ---------------------------------------------------------------------------
//  Inline some methods that are either just passthroughs to other string
//  methods, or which are key for performance.
// ---------------------------------------------------------------------------
inline void XMLString::moveChars(       XMLCh* const    targetStr
                                , const XMLCh* const    srcStr
                                , const unsigned int    count)
{
    XMLCh* outPtr = targetStr;
    const XMLCh* inPtr = srcStr;
    for (unsigned int index = 0; index < count; index++)
        *outPtr++ = *inPtr++;
}

inline unsigned int XMLString::stringLen(const XMLCh* const src)
{
    if (src == 0 || *src == 0)
    {
        return 0;
    }
    else
    {
        const XMLCh* pszTmp = src + 1;

        while (*pszTmp)
            ++pszTmp;

        return (unsigned int)(pszTmp - src);
    }
}

inline bool XMLString::startsWith(  const   XMLCh* const    toTest
                                    , const XMLCh* const    prefix)
{
    return (compareNString(toTest, prefix, stringLen(prefix)) == 0);
}

inline bool XMLString::startsWithI( const   XMLCh* const    toTest
                                    , const XMLCh* const    prefix)
{
    return (compareNIString(toTest, prefix, stringLen(prefix)) == 0);
}

inline XMLCh* XMLString::replicate(const XMLCh* const toRep)
{
    // If a null string, return a null string!
    XMLCh* ret = 0;
    if (toRep)
    {
        const unsigned int len = stringLen(toRep);
        ret = new XMLCh[len + 1];
        XMLCh* outPtr = ret;
        const XMLCh* inPtr = toRep;
        for (unsigned int index = 0; index <= len; index++)
            *outPtr++ = *inPtr++;
    }
    return ret;
}

#endif
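The inline definitions above can be tried without the rest of the library. The following is a minimal standalone sketch, not the library's code: it assumes `char16_t` as a stand-in for `XMLCh` (whose real definition is platform-specific and lives in `XercesDefs.hpp`), and every `sketch*` name is hypothetical. In particular, `sketchCompareN` is only a guess at what `compareNString` does, since this header declares that function but does not define it.

```cpp
#include <cassert>

// Assumption: stand-in for XMLCh; the real type is chosen per platform.
typedef char16_t XMLChSketch;

// Length of a null-terminated string. A null pointer counts as empty,
// mirroring the null check in XMLString::stringLen.
inline unsigned int sketchLen(const XMLChSketch* src)
{
    if (src == 0)
        return 0;
    const XMLChSketch* p = src;
    while (*p)
        ++p;
    return (unsigned int)(p - src);
}

// Hypothetical compareNString: compares at most count characters and
// returns a negative, zero, or positive value. Both pointers must be non-null.
inline int sketchCompareN(const XMLChSketch* a, const XMLChSketch* b,
                          unsigned int count)
{
    for (unsigned int i = 0; i < count; ++i)
    {
        if (a[i] != b[i])
            return (int)a[i] - (int)b[i];
        if (a[i] == 0)      // both strings ended and matched so far
            break;
    }
    return 0;
}

// Same shape as the inline startsWith: compare only the prefix's length.
inline bool sketchStartsWith(const XMLChSketch* toTest,
                             const XMLChSketch* prefix)
{
    return sketchCompareN(toTest, prefix, sketchLen(prefix)) == 0;
}

// Heap copy including the terminator. A null input yields a null result.
// The caller owns the returned array and must release it with delete[].
inline XMLChSketch* sketchReplicate(const XMLChSketch* toRep)
{
    if (!toRep)
        return 0;
    const unsigned int len = sketchLen(toRep);
    XMLChSketch* ret = new XMLChSketch[len + 1];
    for (unsigned int i = 0; i <= len; ++i)
        ret[i] = toRep[i];
    return ret;
}

// Small usage check of the helpers above.
inline void sketchSelfCheck()
{
    assert(sketchStartsWith(u"xmlns:foo", u"xmlns"));
    XMLChSketch* copy = sketchReplicate(u"ab");
    assert(sketchLen(copy) == 2);
    delete [] copy;
}
```

Because `replicate` allocates with `new[]`, the matching `delete []` in the caller is the part that is easiest to forget; the sketch keeps that ownership rule explicit.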

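The revision 1.15 log entry suggests that, for encoding names, a comparison that treats only a-z and A-Z as having case can replace the slow general-purpose wide-string compare. A minimal sketch of that idea follows. It is not the code from the patch: `char16_t` stands in for `XMLCh`, and the names `asciiLowerSketch` and `compareIStringASCIISketch` are hypothetical.

```cpp
#include <cassert>

// Assumption: stand-in for XMLCh; the real type is chosen per platform.
typedef char16_t XMLChSketch;

// Fold only A-Z to a-z. Every other code unit is returned unchanged,
// so no locale or Unicode case tables are consulted.
inline XMLChSketch asciiLowerSketch(XMLChSketch ch)
{
    return (ch >= u'A' && ch <= u'Z')
        ? (XMLChSketch)(ch + (u'a' - u'A'))
        : ch;
}

// Case-insensitive compare that is only correct for ASCII letters.
// Returns a negative, zero, or positive value; null is treated as empty.
inline int compareIStringASCIISketch(const XMLChSketch* s1,
                                     const XMLChSketch* s2)
{
    static const XMLChSketch empty[1] = { 0 };
    if (!s1) s1 = empty;
    if (!s2) s2 = empty;

    for (;;)
    {
        const XMLChSketch c1 = asciiLowerSketch(*s1++);
        const XMLChSketch c2 = asciiLowerSketch(*s2++);
        if (c1 != c2)
            return (int)c1 - (int)c2;
        if (c1 == 0)        // reached both terminators together
            return 0;
    }
}
```

This is adequate for encoding names precisely because, as the log entry notes, they are limited to printable ASCII; it would give wrong answers for text containing non-ASCII letters with case.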

Copyright © 2000 The Apache Software Foundation. All Rights Reserved.