Tags: utf8


  1. 1 year ago by spirit
    For example, if your source files contain accented characters, Python fails with the error "Non-ASCII character '\xc3'".
    At the top of the script, after the shebang (usually #!/usr/bin/env python, or #!/usr/bin/python), insert the following line to force the encoding to UTF-8 (Python 2 assumes ASCII by default; Python 3 defaults to UTF-8):
    
    # -*- coding: utf-8 -*-
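
As a minimal sketch of how such a declaration is read, the following uses `tokenize.detect_encoding` from the Python 3 standard library, which looks for the coding line in the first two lines of a source file. The helper name `declared_encoding` and the two sample sources are illustrative assumptions, not part of the original tip.

```python
import io
import tokenize


def declared_encoding(source: bytes) -> str:
    """Return the encoding Python would use to decode this source file."""
    # detect_encoding looks for a BOM or a coding declaration in the first two lines
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


# Hypothetical sample sources, given as raw bytes the way they would sit on disk
with_cookie = b"#!/usr/bin/env python\n# -*- coding: utf-8 -*-\nprint('caf\xc3\xa9')\n"
latin1_cookie = b"# -*- coding: latin-1 -*-\nprint('caf\xe9')\n"

print(declared_encoding(with_cookie))    # utf-8
print(declared_encoding(latin1_cookie))  # iso-8859-1
```

The declaration only takes effect on the first or second line, which is why it goes directly after the shebang.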
  2. 1 year ago by spirit
    You can check and compare the sort orders provided by the utf8_general_ci and utf8_unicode_ci collations here:
    
    http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
    http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html
    
    utf8_general_ci is a very simple collation. It
    - removes all accents,
    - then converts to upper case,
    and compares the code of the resulting "base letter".
    
    For example, these Latin letters: ÀÁÅåāă (and all other Latin letters "a" with any accents and in any cases) are all compared as equal to "A".
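
The rule described above can be approximated in a few lines of Python. This is only a sketch of the "strip accents, then upper-case" idea using the standard `unicodedata` module; it is not MySQL's implementation, and the function name `general_ci_key` is an assumption made for illustration.

```python
import unicodedata


def general_ci_key(text: str) -> str:
    """Approximate the described rule: strip accents, then upper-case."""
    decomposed = unicodedata.normalize("NFD", text)
    # After NFD decomposition, accents are separate combining marks (category Mn)
    base = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return base.upper()


letters = "ÀÁÅåāă"
print({general_ci_key(ch) for ch in letters})  # {'A'}
```

Two strings whose keys are equal would compare as equal under this simplified rule.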
    
    utf8_unicode_ci uses the default Unicode collation element table (DUCET).
    
    The main differences are:
    
    1. utf8_unicode_ci supports so-called expansions and ligatures. For example, the German letter ß (U+00DF LATIN SMALL LETTER SHARP S) is sorted near "ss", and the letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near "OE".
    
    utf8_general_ci does not support expansions or ligatures; it sorts all these letters as single characters, sometimes in the wrong order.
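
To make the idea of an expansion concrete, here is a small Python sketch. The `EXPANSIONS` table is a hand-written, illustrative subset (an assumption, not the DUCET data), and the plain `sorted()` call shows naive code-point ordering for contrast; it does not reproduce either MySQL collation.

```python
# Illustrative subset of expansions: one character is compared as several
EXPANSIONS = {"ß": "ss", "Œ": "OE", "œ": "oe"}


def expansion_key(word: str) -> str:
    """Expand ligature-like letters, then ignore case."""
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in word)
    return expanded.upper()


words = ["Strato", "Straße", "Strasse"]

# With expansions, "Straße" compares like "Strasse" and stays next to it
print(sorted(words, key=expansion_key))  # ['Straße', 'Strasse', 'Strato']

# Plain code-point order treats ß as a single character far beyond "t"
print(sorted(words))  # ['Strasse', 'Strato', 'Straße']
```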
    
    2. utf8_unicode_ci is *generally* more accurate for all scripts. For example, in the Cyrillic block, utf8_unicode_ci works well for Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian, while utf8_general_ci works well only for the Russian and Bulgarian subset of Cyrillic. The extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian are not sorted well.
    
    The disadvantage of utf8_unicode_ci is that it is slightly slower than utf8_general_ci.
    
    So when you need a better sort order, use utf8_unicode_ci; when performance matters most, use utf8_general_ci.
  3. 2 years ago by advitam
    file -i <text file>    # prints the encoding of a text file (charset=...); requires file version 4.0 or later
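
When the `file` command is not available, a rough guess can be made in Python. This is a simplified sketch, not the algorithm `file` uses: it only separates pure ASCII, valid UTF-8, and everything else. The function name `guess_charset` is hypothetical, and the returned labels are chosen to resemble typical `charset=` values.

```python
def guess_charset(data: bytes) -> str:
    """Classify raw bytes as ASCII, UTF-8, or some other 8-bit encoding."""
    if all(byte < 0x80 for byte in data):
        return "us-ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: could be Latin-1 or any other legacy encoding
        return "unknown-8bit"
    return "utf-8"


print(guess_charset(b"cafe"))                   # us-ascii
print(guess_charset("café".encode("utf-8")))    # utf-8
print(guess_charset("café".encode("latin-1")))  # unknown-8bit
```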
  5. 2 years ago by advitam
    iconv -f iso-8859-1 -t utf-8 <in >out    # to UTF-8
    iconv -f utf-8 -t iso-8859-1 <in >out    # to Latin-1
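
The same two conversions can be sketched in Python by decoding with the source encoding and re-encoding with the target one. The helper name `transcode` is an assumption for illustration; the codec names are the standard Python ones.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target one."""
    return data.decode(source).encode(target)


latin1_bytes = "é".encode("iso-8859-1")
utf8_bytes = transcode(latin1_bytes, "iso-8859-1", "utf-8")

print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'
print(transcode(utf8_bytes, "utf-8", "iso-8859-1") == latin1_bytes)  # True
```

The `\xc3` byte in the UTF-8 output is the lead byte of many accented Latin letters, which is why it often shows up in encoding error messages.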
  6. 2 years ago by neorom and saved by 1 other
    #!/bin/bash
    # Convert every .html and .php file under the current directory from Latin-1 to UTF-8, in place.
    # Warning: running it twice double-encodes files that are already UTF-8.
    find . -type f \( -name "*.html" -o -name "*.php" \) -print0 |
    while IFS= read -r -d '' file
    do
        # A unique temporary file avoids clashes between files that share a base name
        tmp=$(mktemp) || exit 1
        # Replace the original only if the conversion succeeded
        if iconv -f latin1 -t utf-8 "$file" > "$tmp"
        then
            cat "$tmp" > "$file"    # overwrite in place to keep permissions and ownership
        fi
        rm -f "$tmp"
    done
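
A Python variant of the same batch conversion is sketched below. Unlike the shell script, it skips files that already decode as UTF-8, so running it a second time does not double-encode anything. The function name `convert_tree` and its parameters are assumptions made for illustration, and the sketch assumes every remaining file really is Latin-1.

```python
from pathlib import Path


def convert_tree(root: Path, suffixes=(".html", ".php")) -> list:
    """Convert Latin-1 files under root to UTF-8 in place; return the files changed."""
    converted = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        data = path.read_bytes()
        try:
            data.decode("utf-8")
            continue  # already valid UTF-8 (or pure ASCII): leave untouched
        except UnicodeDecodeError:
            pass
        # Every byte sequence is valid Latin-1, so this decode cannot fail
        path.write_bytes(data.decode("latin-1").encode("utf-8"))
        converted.append(path)
    return converted
```

Calling `convert_tree(Path("."))` would process the current directory; making a backup first is advisable, since the files are rewritten in place.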
