Benzer içerik için metin bloklarını Karşılaştırılması

0 Cevap php

I have two text blocks, containing company names. Both contain names of hundreds of companies, the newer list has more companies. How do I remove the duplicate company names from the two lists , so that I am left with the new names only? Sample text block one:

Company Name One, Address line 1, line 2, phone, email
..
Random text 
..

Company Name Two 
Address, Phone,email
..
Random text 
..
Company Name 3 
Address, Phone,email

Örnek metin bloğu iki:

..Random Text..
M/s Company Name One Extra Random Text, Address line 1, line 2, phone, 
..random text...
M/s Company Name Two 
Address, Phone
...

The Company name, address etc are similar not same. The second block has the words M/s before all company names. I would like to do this in php, using regex perhaps.

I would like to out put company names which match, for eg. in the example given above I would like to output that Company names: Company name One, Company name two are common to both test blocks.

Güncelleme: @Wrikken, iki dizeleri metni için teşekkürler. Ben M/s kullanarak ikinci bir blok patlayabilir, ve bir dizi alabilirsiniz. Nasıl sonra uzun bir dize ilk metin bloğunu maç için bu dizideki her bir öğeyi kontrol edebilirim?

Ben beri elle iş yaptık, ancak ben hala iki metin blokları benzerlik ve dolayısıyla lütuf karşılaştırıldığında nasıl bilmek istiyorum.

Update: Output for @Joyce Babu code

..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. ..Random Text.. M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, M/s Company Name One Extra Random Text, Address line 1, line 2, phone, ..random text... ..random text... ..random text... ..random text... ..random text... ..random text... ..random text... ..random text... ..random text... ..random text... M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two M/s Company Name Two Address, Phone Address, Phone Address, Phone Address, Phone Address, Phone Address, Phone Address, Phone Address, Phone Address, Phone Address, Phone ... ... ... ... ... ... ... ... ... ... ... ... 

Output for @nikic

array(2) { [0]=>  string(17) "..Random Text.. " [4]=>  string(16) "Address, Phone " } 

Output for @Joyce Babu second post

andom Text..andom Text..andom Text..andom Text..andom Text..andom Text..andom Text..andom Text..andom Text..andom Text..Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,Company Name One Extra Random Text, Address line 1, line 2, phone,andom text...andom text...andom text...andom text...andom text...andom text...andom text...andom text...andom text...andom text...Company Name TwoCompany Name TwoCompany Name TwoCompany Name TwoCompany Name TwoCompany Name TwoCompany Name TwoCompany Name TwoCompany Name TwoCompany Name Tworess, Phoneress, Phoneress, Phoneress, Phoneress, Phoneress, Phoneress, Phoneress, Phoneress, Phoneress, Phone

@ Joyce Babu Final Kodu

<?php
set_time_limit(500);
$arOld = file('olddata.txt');
$arNew = file('newdata.txt');
$G=0;
    $c=0;

    foreach($arNew as $line){
    if(substr($line, 0, 4) == 'M/s '){
    $c++;   
    echo "<BR/>".$c.".)";
        $line = trim(substr($line, 4));
        foreach($arOld as $old){
            similar_text($line, $old, $percentage);
            if ($percentage > 80){
                continue;
            }
        }
        echo $line;
    }else{
    $G++;
    }
}
echo "<br/>".$G . " DID NOT MATCH";
?>

Joyce Babu son kodundan @ Çıktı

1.)Company Name One Extra Random Text, Address line 1, line 2, phone,
2.)Company Name Two
4 DID NOT MATCH

0 Cevap