Question

I am looking for a way to split a string containing HTML into two halves. Requirements:

Split the string at a given number of characters
It must not split in the middle of a word
HTML characters must not be counted when calculating where to split the string

For example, take the following string:

This is a test string that contains HTML tags and text content. This string needs to be split without slicing through the middle of a word and must preserve the validity of the HTML, i.e. not split in the middle of a tag, and make sure closing tags are respected correctly.

Say I want to split at character position 39, which falls in the middle of the word HTML (not counting the HTML markup). I would want the function to split the string into the following two parts:

This is a test string that contains HTML

and

tags and text content. This string needs to be split without slicing through the middle of a word and must preserve the validity of the HTML, i.e. not split in the middle of a tag, and make sure closing tags are respected correctly.

Notice that in the two example results above I require the HTML validity to be respected, so the closing tags were added. Also, an opening tag was added to the second half, since one is closed at the end of the string.

I found this function on StackOverflow to truncate a string by a number of text characters and preserve HTML, but it only goes halfway to what I need, as I need to split into two halves.

function printTruncated($maxLength, $html)
{
    $printedLength = 0;
    $position = 0;
    $tags = array();

    while ($printedLength < $maxLength && preg_match('{</?([a-z]+)[^>]*>|&#?[a-zA-Z0-9]+;}', $html, $match, PREG_OFFSET_CAPTURE, $position))
    {
        list($tag, $tagPosition) = $match[0];

        // Print text leading up to the tag.
        $str = substr($html, $position, $tagPosition - $position);
        if ($printedLength + strlen($str) > $maxLength)
        {
            print(substr($str, 0, $maxLength - $printedLength));
            $printedLength = $maxLength;
            break;
        }

        print($str);
        $printedLength += strlen($str);

        if ($tag[0] == '&')
        {
            // Handle the entity.
            print($tag);
            $printedLength++;
        }
        else
        {
            // Handle the tag.
            $tagName = $match[1][0];
            if ($tag[1] == '/')
            {
                // This is a closing tag.

                $openingTag = array_pop($tags);
                assert($openingTag == $tagName); // check that tags are properly nested.

                print($tag);
            }
            else if ($tag[strlen($tag) - 2] == '/')
            {
                // Self-closing tag.
                print($tag);
            }
            else
            {
                // Opening tag.
                print($tag);
                $tags[] = $tagName;
            }
        }

        // Continue after the tag.
        $position = $tagPosition + strlen($tag);
    }

    // Print any remaining text.
    if ($printedLength < $maxLength && $position < strlen($html))
        print(substr($html, $position, $maxLength - $printedLength));

    // Close any open tags.
    while (!empty($tags))
        printf('</%s>', array_pop($tags));
}

Answer 1

The general rule that almost every other answer will cite is: "don't process HTML with regex; you won't catch all the edge cases."

I believe this is quite true.

All it takes is a slightly malformed string, and even a well-crafted regular expression will still mess it up.
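As a small illustration (not from the original answer), here is how a tag pattern in the style of the one used in the question's function can stop too early. The sample markup is an assumption chosen to trigger the problem: an attribute value that contains a `>` character.

```php
<?php
// Sample markup (an assumption for illustration): the title attribute
// contains a literal '>' character, which is legal inside a quoted value.
$html = '<a title="1 > 0">link</a>';

// Tag pattern similar to the one in the question's function.
preg_match('{</?([a-z]+)[^>]*>}', $html, $match);

// The pattern stops at the first '>', so the "tag" it reports is cut off
// in the middle of the attribute value.
var_dump($match[0]);
```

The match ends inside the attribute, so any character counting based on it would treat the rest of the real tag as visible text.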

If you want to split some tags and not others (p tags are tags too, after all, and you are looking to split one in two), you need to rethink the process and be very specific about what you want to achieve. For example: is it okay to split in the middle of a paragraph tag? What about divs? If the midpoint is inside a tag, do you want the first string or the second to be the longer one?

Assuming that splitting paragraph tags is okay, but others aren't, try an approach as follows (no copy-paste code here, sorry):

* Strip the target string twice: once of all tags, and once of just paragraph tags
* Find the middle point in the no-tags-at-all string
* Split the no-tags-at-all string at the first space after the middle point
* Find the spot in the just-p-tags-stripped string that matches the word/words just after the middle point in the previous step. This should tell you where in the just-p-tags-stripped string 'the middle' is when tags are ignored
* Check to see if you're inside a tag.
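The first steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: `findSplitOffset` is a hypothetical name, the markup is assumed to be well formed, the text single-byte, and there are assumed to be no entities, comments, or literal `<` characters in the text. Instead of matching words between two stripped copies, it maps the text position back to the markup by walking the string and skipping anything between `<` and `>`.

```php
<?php
// Hypothetical helper: returns the byte offset in $html that corresponds to
// the first space at or after $target characters of visible text.
function findSplitOffset(string $html, int $target): int
{
    $text = strip_tags($html);          // the "no-tags-at-all" string
    $target = max(0, $target);
    if ($target >= strlen($text)) {
        return strlen($html);           // nothing left to split off
    }

    // First space at or after the target, so no word is cut.
    $cut = strpos($text, ' ', $target);
    if ($cut === false) {
        return strlen($html);
    }

    // Walk the markup, counting only characters outside of tags, until the
    // visible-character count reaches the cut position.
    $visible = 0;
    $inTag = false;
    $length = strlen($html);
    for ($i = 0; $i < $length; $i++) {
        $char = $html[$i];
        if ($inTag) {
            if ($char === '>') {
                $inTag = false;
            }
            continue;
        }
        if ($char === '<') {
            $inTag = true;
            continue;
        }
        if ($visible === $cut) {
            return $i;                  // always outside tag markup
        }
        $visible++;
    }
    return $length;
}
```

The returned offset never falls inside a tag's own markup, but it can still fall inside an open element, so the two pieces are not yet balanced HTML.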

... Actually, as I got to this point, I realized that 90% of what I wrote is obvious, and that the last bullet point is exactly where the problem lies, and it is quite a nuisance.
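One possible way to approach that last step is to track which elements are still open at the split point, much like the truncation function in the question does with its `$tags` stack. The sketch below is an assumption-laden illustration, not a robust solution: `splitBalanced` is a hypothetical name, the offset is assumed to fall between tags rather than inside one, the markup is assumed to be properly nested, and void elements are only recognised when written in self-closing form (`<br />`). It also relies on a regular expression, so the caveats above apply to it as well.

```php
<?php
// Hypothetical helper: splits $html at byte $offset, closing the elements
// that are still open in the first half and reopening them in the second.
function splitBalanced(string $html, int $offset): array
{
    $first = substr($html, 0, $offset);
    $second = substr($html, $offset);
    $open = []; // stack of [tag name, full opening tag markup]

    // Groups: 1 = '/' for closing tags, 2 = tag name, 3 = '/' if self-closing.
    preg_match_all('{<(/?)([a-z][a-z0-9]*)[^>]*?(/?)>}i', $first, $tags, PREG_SET_ORDER);
    foreach ($tags as $tag) {
        if ($tag[1] === '/') {
            array_pop($open);               // closing tag: element is done
        } elseif ($tag[3] !== '/') {
            $open[] = [$tag[2], $tag[0]];   // opening tag: remember it
        }
    }

    $closers = '';
    $openers = '';
    foreach ($open as [$name, $markup]) {
        $closers = "</$name>" . $closers;   // innermost element closes first
        $openers .= $markup;                // outermost element reopens first
    }
    return [$first . $closers, $openers . $second];
}
```

Reusing the full opening tag markup means attributes such as classes are carried over to the second half; whether that is desirable (for example with `id` attributes) depends on the use case.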

I'll leave my half-baked rant here as a warning to others, and be on my way...

Answer 2

I know it's a bit late, but for people who are also looking for this answer:

<?php

$string = 'When one thinks of luxury travel throughout the Mediterranean, a handful of destinations are bound to be at the top of the list, destinations like the French Riviera, the Grecian Islands, or the Italian coastline.';

echo current(explode('::BR::', wordwrap($string, 20, '::BR::')));
?>
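Note that the snippet above only returns the first chunk, and `wordwrap` treats any tags as ordinary characters. A small sketch of how the same idea might return both parts of a plain-text string is shown below. `splitPlainText` is a hypothetical name, and the input is assumed to contain no tags and no newlines.

```php
<?php
// Hypothetical helper: splits plain text into two parts so that the first
// part is at most $width characters long and no word is cut in half.
function splitPlainText(string $text, int $width): array
{
    // wordwrap() replaces the space at each break point with "\n";
    // the first line is therefore a prefix of the original text.
    $wrapped = wordwrap($text, $width, "\n", false);
    $first = explode("\n", $wrapped, 2)[0];

    // Everything after the first line, without the space that separated them.
    $second = ltrim(substr($text, strlen($first)));

    return [$first, $second];
}

[$head, $tail] = splitPlainText('When one thinks of luxury travel', 20);
echo $head, "\n", $tail, "\n";
```

Because the fourth argument of `wordwrap` is `false`, a first word longer than `$width` is kept whole rather than cut.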

Splitting an HTML string into two strings with PHP without slicing through a word and preserving HTML
