From: Oded Arbel (oded-linux_at_nonexisting.hamakor.org.il)
Date: Thu 17 Jun 2004 - 11:12:07 IDT
On Thursday 17 June 2004 08:25, Oron Peled wrote:
> On Wednesday 16 June 2004 11:58, Ilan Aisic wrote:
> > In particular, I'm interested in changing multibyte Hebrew to and from
> > HTML characters.
> > where:
> > The same in HTML ("&#1488;" is Alef, '&#46;' is '.'):
>
> Ok, for this part of the question (nobody answered yet), why not use sed?
> Write the following script:
> #! /bin/sed -f
> s/<Alef>/\&\#1488;/g
> s/<Beit>/\&\#1489;/g
> ...
It should be relatively easy to do so with Perl, something like
cat source | perl -CI -pe 'use utf8; s/א/&#1488;/g ... '
after you make sure <source> is in UTF-8 (using iconv if required)
or even
cat source | perl -CI -pe 'use utf8; s/([א-ת])/"&#" . ord($1) . ";"/eg'
to save on the typing :-)
-- Oded ::.. You can always tell luck from ability by its duration.