Question

When using preg_replace() in PHP with strings generated at runtime, you can protect special regex characters (such as '$' or '+') in the search string by using preg_quote(). But what is the correct way to handle this in the replacement string? Take this code, for example:

<?php

$haystack = '...a bit of sample text...';
$replacement = '\\HELLO WORLD$1.+-';
$replacement_quoted = preg_quote($replacement);

var_dump('--replacement', $replacement, '--replacement_quoted',
    $replacement_quoted, '--haystack', $haystack);

$result1 = preg_replace("@(bit) (of) (sample)@is", "\${1}" . $replacement ."$3", $haystack);
$result2 = preg_replace("@(bit) (of) (sample)@is", "\${1}" . $replacement_quoted ."$3", $haystack);

$replacement_new1 = str_replace('$', '\$', $replacement);
$replacement_new2 = str_replace('\\', '\\\\', $replacement_new1);

$result3 = preg_replace("@(bit) (of) (sample)@is", "\${1}" . $replacement_new1 ."$3", $haystack);
$result4 = preg_replace("@(bit) (of) (sample)@is", "\${1}" . $replacement_new2 ."$3", $haystack);

var_dump('--result1 (not quoted)', $result1, '--result2 (quoted)', $result2,
    '--result3 ($ escaped)', $result3, '--result4 (\ and $ escaped)', $result3);

?>

Here is the output:

string(13) "--replacement"
string(17) "\HELLO WORLD$1.+-"
string(20) "--replacement_quoted"
string(22) "\\HELLO WORLD\$1\.\+\-"
string(10) "--haystack"
string(26) "...a bit of sample text..."
string(22) "--result1 (not quoted)"
string(40) "...a bit\HELLO WORLDbit.+-sample text..."
string(18) "--result2 (quoted)"
string(42) "...a bit\HELLO WORLD$1\.\+\-sample text..."
string(21) "--result3 ($ escaped)"
string(39) "...a bit\HELLO WORLD$1.+-sample text..."
string(27) "--result4 (\ and $ escaped)"
string(39) "...a bit\HELLO WORLD$1.+-sample text..."

As you can see, you can't win with preg_quote(). If you don't call it and just pass the string in unmodified (result1), anything that looks like a capture token ($1 above) gets replaced with whatever the corresponding capture group contained. If you do call it (result2), you have no problems with the capture groups, but any other special PCRE characters (such as *) get escaped as well, and the escaped characters manage to live on in the output. Also interesting to me is that both versions produce a single \ in the output.

Only by quoting characters manually, specifically the $, can you get this to work. This can be seen in result3 and result4. The oddness with the \ continues, however: both result3 and result4 (which also escapes the \) again produce a single \ in the output. Adding six \ characters at the start of the replacement string produces only two \ in the final output for result1, result3 and result4, and three of them for result2.
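The manual escaping described above can be sketched as a small helper. This is a minimal sketch, not a built-in: `escape_replacement` is a hypothetical name introduced here for illustration. It relies on the fact that preg_replace() turns `\\` into a single `\` and `\$` into a literal `$` in the replacement string, which is also why a lone `\` and a doubled `\\` both come out as one `\`. Order matters when escaping by hand: in the listing above the `$` is escaped first, so the second str_replace() also doubles the backslash that was just added in front of it. Note too that the final var_dump() passes $result3 under the result4 label, so $result4 itself is never printed. Using strtr() with an array avoids the ordering problem because it makes a single pass.

```php
<?php

// Hypothetical helper (not a PHP built-in): escape the two characters that
// preg_replace() treats specially in a replacement string.
function escape_replacement(string $text): string
{
    // strtr() with an array makes a single pass over the input, so the
    // backslash added in front of '$' is not escaped a second time.
    return strtr($text, ['\\' => '\\\\', '$' => '\\$']);
}

$haystack    = '...a bit of sample text...';
$replacement = '\\HELLO WORLD$1.+-';

// Single-quoted strings avoid the "\${1}" dance needed in double quotes.
$result = preg_replace(
    '@(bit) (of) (sample)@is',
    '${1}' . escape_replacement($replacement) . '$3',
    $haystack
);

var_dump($result);
// string(39) "...a bit\HELLO WORLD$1.+-sample text..."
```

Only `\` and `$` are touched, so characters such as `.`, `+` and `-` pass through unchanged, unlike with preg_quote().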

So it seems most of the issues are taken care of by manually escaping the $ character. It looks like the \ character needs to be escaped as well, but I need to think about that one a bit more to figure out exactly what is happening. In any case, this is all pretty ugly: between the annoying \${1} syntax and having to escape some characters by hand, the code just smells really rotten and error-prone. Is there something I'm missing? Is there a clean way to do this?
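One possible way to sidestep replacement-string parsing altogether is preg_replace_callback(). The string returned by the callback is inserted as-is and is not scanned for `$n` or `\n` references, so nothing in the runtime-generated text needs escaping. The following is a minimal sketch that reuses the variables from the listing above and assumes PHP 7 or later for the type declarations.

```php
<?php

$haystack    = '...a bit of sample text...';
$replacement = '\\HELLO WORLD$1.+-';

// The callback's return value is used verbatim, so $replacement can contain
// backslashes and dollar signs without any special handling.
$result = preg_replace_callback(
    '@(bit) (of) (sample)@is',
    function (array $m) use ($replacement): string {
        // $m[1] and $m[3] are the first and third capture groups.
        return $m[1] . $replacement . $m[3];
    },
    $haystack
);

var_dump($result);
// string(39) "...a bit\HELLO WORLD$1.+-sample text..."
```

The trade-off is a little more code per call, in exchange for building the replacement with ordinary string concatenation instead of the `${1}` syntax.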

Avoiding processing of special preg characters in the replacement string

0 Answers
