[linux-audio-dev] PTAF link and comments
    Sami P Perttu 
    perttu at cc.helsinki.fi
       
    Fri Feb  7 02:58:00 UTC 2003
    
    
  
On Thu, 6 Feb 2003, David Olofson wrote:
> Would be interesting to know which ASCII values are valid inside
> multibyte characters, BTW. Is there a risk you'll see false slashes,
> colons and things like that in paths, if you don't parse the UTF-8
> properly? (There isn't IIRC, but I'll have to read up on this.)
No. Every byte of a multibyte character has its highest bit set, so it
can never be mistaken for an ASCII character such as a slash or a colon.
That is one good thing about UTF-8: even structured documents can be
parsed without precise knowledge of the encoding, as long as the encoding
is backwards compatible with ASCII. UTF-8 is pretty easy to split, too: any
byte that does not start a new character has the bit pattern 10xxxxxx.
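
To make both points concrete, here is a minimal Python sketch. The helper
names (is_continuation, split_chars) and the sample path are made up for
illustration, and the input is assumed to be well-formed UTF-8; this is
not a validator.

```python
def is_continuation(byte):
    """True for a UTF-8 continuation byte (bit pattern 10xxxxxx)."""
    return (byte & 0xC0) == 0x80


def split_chars(data):
    """Split well-formed UTF-8 bytes into one bytes object per character.

    A new character starts at every byte that is not a continuation byte.
    """
    chars = []
    start = 0
    for i in range(1, len(data)):
        if not is_continuation(data[i]):
            chars.append(data[start:i])
            start = i
    if data:
        chars.append(data[start:])
    return chars


# Hypothetical path containing an a-umlaut and two CJK characters.
path = "/home/s\u00e4mi/\u97f3\u697d/track:1.wav".encode("utf-8")

# Byte-level split on the ASCII slash (0x2F), with no UTF-8 decoding.
# It cannot cut inside a character, because every byte of a multibyte
# character is >= 0x80.
parts = path.split(b"/")

# Per-character split using only the 10xxxxxx test.
chars = split_chars(path)
```

Each element of parts still decodes cleanly as UTF-8, and joining chars
gives back the original bytes.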
--
Sami Perttu                       "Flower chase the sunshine"
Sami.Perttu at hiit.fi               http://www.cs.helsinki.fi/u/perttu
    
    