min_length & max_length utf-8 problem

| | | | |
|---|---|---|---|
| Date: | 06/29/2008 | Severity: | Minor |
| Status: | Resolved | Reporter: | Zver1992 |
| Version: | 1.6.3 | ||
| Keywords: | Libraries, Validation Class | ||
Description
The standard PHP function strlen() reports the wrong length for some UTF-8 strings (Russian and other Cyrillic-script languages, for example). It counts bytes rather than characters, so 1 Russian character is counted as 2. In the manual ( http://www.php.net/strlen ) I found a function that counts correctly. Example:
<?php
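function strlen_utf8($str)
{
    $i = 0;
    $count = 0;
    $len = strlen($str); // length in bytes
    while ($i < $len) {
        $chr = ord($str[$i]); // lead byte of the next character
        $count++;
        $i++;
        if ($i >= $len) {
            break;
        }
        if ($chr & 0x80) {
            // Multibyte character: each extra leading 1 bit means one continuation byte to skip.
            $chr <<= 1;
            while ($chr & 0x80) {
                $i++;
                $chr <<= 1;
            }
        }
    }
    return $count;
}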
$string = 'авбгд';
echo strlen($string); //10, wrong
echo strlen_utf8($string); //5, right
?>
I replaced every strlen() call with strlen_utf8() in my Validation Class, and my script works correctly now.
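The mismatch comes from strlen() counting bytes: in UTF-8 each of these Cyrillic letters takes two bytes. As a minimal sketch of the same idea as strlen_utf8(), the helper count_utf8_chars() below (a hypothetical name, not part of PHP or the framework) counts every byte that is not a continuation byte (bit pattern 10xxxxxx). It assumes the input is valid UTF-8 and that the source file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper: counts UTF-8 characters (code points) by
// skipping continuation bytes. Assumes $str is valid UTF-8.
function count_utf8_chars($str)
{
    $count = 0;
    $len = strlen($str); // length in bytes
    for ($i = 0; $i < $len; $i++) {
        // Continuation bytes look like 10xxxxxx; every other byte starts a character.
        if ((ord($str[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

$string = 'авбгд';
echo strlen($string), "\n";           // 10 (bytes)
echo count_utf8_chars($string), "\n"; // 5 (characters)
```

Because it only inspects the top two bits of each byte, this version does not validate the input; malformed sequences are counted as-is.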
Code Sample
<?php
function min_length($str, $val)
{
    if (preg_match("/[^0-9]/", $val))
    {
        return FALSE;
    }
    return (strlen($str) < $val) ? FALSE : TRUE;
}
$check = min_length('авбгд', 6);
?>
Expected Result
FALSE
Actual Result
TRUE
Comment on Bug Report
Posted by: Sam Dark on 30 June 2008 10:29am
It’s not only strlen. As far as I know, there are also problems with trim, ltrim, and rtrim.
Posted by: inparo on 1 July 2008 8:46am
That is what the multibyte string functions are for. So in your case it would be a simple mb_strlen($str, 'UTF-8'). Remember that you can add your own validation rules by extending the class, so there’s no need to alter core files.
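The mb_strlen() suggestion can be sketched roughly as follows. The function names min_length_utf8(), max_length_utf8(), and utf8_length() are hypothetical, the max_length logic is assumed to mirror the min_length rule shown in the report, and the fallback for installations without the mbstring extension is an addition of this sketch. How the rules would be attached to an extended Validation class is left out.

```php
<?php
// Hypothetical helper: character count for a UTF-8 string.
function utf8_length($str)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback without mbstring: with the /u modifier each "." match
    // is one code point (returns FALSE if $str is not valid UTF-8).
    return preg_match_all('/./us', $str, $matches);
}

// Hypothetical UTF-8 aware variant of the min_length rule.
function min_length_utf8($str, $val)
{
    // Reject a non-numeric length, as the original rule does.
    if (preg_match("/[^0-9]/", $val)) {
        return FALSE;
    }
    return (utf8_length($str) < $val) ? FALSE : TRUE;
}

// Hypothetical UTF-8 aware variant of the max_length rule
// (assumed to be the mirror image of min_length).
function max_length_utf8($str, $val)
{
    if (preg_match("/[^0-9]/", $val)) {
        return FALSE;
    }
    return (utf8_length($str) > $val) ? FALSE : TRUE;
}

var_dump(min_length_utf8('авбгд', 6)); // bool(false): 5 characters < 6
var_dump(min_length_utf8('авбгд', 5)); // bool(true)
var_dump(max_length_utf8('авбгд', 5)); // bool(true)
```

With these variants, the check from the Code Sample section returns FALSE, which is the expected result in the report.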
Posted by: Sam Dark on 1 July 2008 9:05am
inparo, the multibyte string functions aren’t perfect either.
Posted by: inparo on 1 July 2008 9:55am
No, but they solve the ‘bug’ outlined above.
Posted by: Mark (Germany) on 10 July 2008 5:28am
Wouldn’t it be much nicer to let CodeI. handle it? That’s what you would usually expect when using these functions.
Posted by: Sam Dark on 10 July 2008 5:34am
This can be really useful.
