Bir web kazıyıcı yazmaya çalışıyorum. Ben bir satırdaki tüm hücreleri almak istiyorum. Ben istiyorum bir önce satır, düz metin değeri olarak THOROUGHBRED TOPLANTILAR vardır. Ben başarıyla bu satır alabilirsiniz. Ama hücreler veya <td>
etiketleri sonraki satırın çocuklarını almak için nasıl anlamaya olamaz.
if ($foundTag = FindTagByText("THOROUGHBRED MEETINGS", $html))
{
$cell = $foundTag->parent();
$row = $cell->parent();
$nextRow = $row->next_sibling();
echo "Row: ".$row->plaintext."<br />\n";
echo "Next Row: ".$nextRow->plaintext."<br />\n";
$cells = $nextRow->children();
foreach ($cells as $cell)
{
echo "Cell: ".$cell->plaintext."<br />\n";
}
}
function FindTagByText($text, $html)
{
// Use Simple_HTML_DOM special selector 'text'
// to retrieve all text nodes from the document
$textNodes = $html->find('text');
$foundTag = null;
foreach($textNodes as $textNode)
{
if($textNode->plaintext == $text)
{
// Get the parent of the text node
// (A text node is always a child of
// its container)
$foundTag = $textNode->parent();
break;
}
}
return $foundTag;
}
İşte ayrıştırmak çalışıyorum html:
<tr valign=top>
<td colspan=16 bgcolor=#999999><b>THOROUGHBRED MEETINGS</b></td>
</tr>
<tr valign=top bgcolor="#ffffff">
<td><b>BR</b> <a href="meeting?mtg=br&day=today&curtype=0">SUNSHINE COAST</a></td>
<td>FINE/DEAD</b></td>
<td><font color=#cc0000><b>R1</b></font>@<b>12:30pm</b></td>
<td align=center bgcolor=#cc0000><a href="odds?mting=BR01000"><b><font color=#ffffff>1</a></font></td>
<td align=center><a href="odds?mting=BR02000"><b><font color=black>2</b></font></a></td>
<td align=center><a href="odds?mting=BR03000"><b><font color=black>3</b></font></a></td>
<td align=center><a href="odds?mting=BR04000"><b><font color=black>4</b></font></a></td>
<td align=center><a href="odds?mting=BR05000"><b><font color=black>5</b></font></a></td>
<td align=center><a href="odds?mting=BR06000"><b><font color=black>6</b></font></a></td>
<td align=center><a href="odds?mting=BR07000"><b><font color=black>7</b></font></a></td>
<td align=center><a href="odds?mting=BR08000"><b><font color=black>8</b></font></a></td>
<td bgcolor="#ffffff" colspan=4> </td>
</tr>
İşte benim çıkış:
Row: THOROUGHBRED MEETINGS Next Row: BR SUNSHINE COAST FINE/DEAD R1@12:30pm 1 2 3 4 5 6 7 8 CR NEW ZEALAND FINE/DEAD R3@11:10am 1 2 3 4 5 6 7 8 9 DR HOBART OCAST/HVY R1@12:15pm 1 2 3 4 5 6 7 MR CRANBOURNE OCAST/SLOW R1@12:20pm 1 2 3 4 5 6 7 8 NR COFFS HARBOUR OCAST/SLOW R1@12:45pm 1 2 3 4 5 6 7 8 SR MORUYA FINE/GOOD R1@12:25pm 1 2 3 4 5 6 7 8 VR BENALLA OCAST/SLOW R1@12:35pm 1 2 3 4 5 6 7 8 XR KALGOORLIE FINE/GOOD R1@ 3:00pm 1 2 3 4 5 6 7 HARNESS MEETINGS DT LAUNCESTON SHWRY/GOOD R1@ 4:57pm 1 2 3 4 5 6 7 8 9 10 MT CRANBOURNE OCAST/GOOD R1@ 5:05pm 1 2 3 4 5 6 7 8 GREYHOUND MEETINGS AD GAWLER OCAST/GOOD R1@ 5:10pm 1 2 3 4 5 6 7 8 9 10 11 CD CANBERRA OCAST/GOOD R1@ 5:02pm 1 2 3 4 5 6 7 8 9 10 11 MD SALE FINE/GOOD R1@ 4:54pm 1 2 3 4 5 6 7 8 9 10 11 12 Cell: BR SUNSHINE COAST Cell: FINE/DEAD Cell: R1@12:30pm Cell: 1 2 3 4 5 6 7 8 CR NEW ZEALAND FINE/DEAD R3@11:10am 1 2 3 4 5 6 7 8 9 DR HOBART OCAST/HVY R1@12:15pm 1 2 3 4 5 6 7 MR CRANBOURNE OCAST/SLOW R1@12:20pm 1 2 3 4 5 6 7 8 NR COFFS HARBOUR OCAST/SLOW R1@12:45pm 1 2 3 4 5 6 7 8 SR MORUYA FINE/GOOD R1@12:25pm 1 2 3 4 5 6 7 8 VR BENALLA OCAST/SLOW R1@12:35pm 1 2 3 4 5 6 7 8 XR KALGOORLIE FINE/GOOD R1@ 3:00pm 1 2 3 4 5 6 7 HARNESS MEETINGS DT LAUNCESTON SHWRY/GOOD R1@ 4:57pm 1 2 3 4 5 6 7 8 9 10 MT CRANBOURNE OCAST/GOOD R1@ 5:05pm 1 2 3 4 5 6 7 8 GREYHOUND MEETINGS AD GAWLER OCAST/GOOD R1@ 5:10pm 1 2 3 4 5 6 7 8 9 10 11 CD CANBERRA OCAST/GOOD R1@ 5:02pm 1 2 3 4 5 6 7 8 9 10 11 MD SALE FINE/GOOD R1@ 4:54pm 1 2 3 4 5 6 7 8 9 10 11 12