Joining text when parsing a complex table structure with Ruby and Nokogiri


I have an HTML table and I want to get the text from some of its td elements. Sometimes the text is in a single td, but sometimes it is spread across multiple td elements. How can I join the text when it is spread across multiple td elements? Here is the HTML:

    <table class="detailRecordTable">
      <tbody>
        <tr><td class="detailSeperator" colspan="6">&nbsp;</td></tr>
        <tr>
          <td valign="top" style="width: 11% " class="detailData"><b>02/03/2016</b></td>
          <td style="width: 3%" class="detailLabels" valign="top">&nbsp;</td>
          <td style="width: 85%" class="detailData alignData" colspan="3"><b>Disposed- Pet for Writ Denied</b></td>
          <td style="width: 1%" class="detailData">&nbsp;</td>
        </tr>
        <tr>
          <td colspan="2" style="width: 14% " class="detailLabels" valign="top">&nbsp;</td>
          <td style="width: 86%  " class="detailData" colspan="2">ORDER ISSUED:  PETITION FOR WRIT OF MANDAMUS DENIED. MANDATE AVAILABLE TO COUNSEL OF RECORD VIA SECURE CASE.NET.</td>
        </tr>

        <tr><td class="detailSeperator" colspan="6">&nbsp;</td></tr>
        <tr>
          <td valign="top" style="width: 11% " class="detailData"><b>01/29/2016</b></td>
          <td style="width: 3%" class="detailLabels" valign="top">&nbsp;</td>
          <td style="width: 85%" class="detailData alignData" colspan="3"><b>Suggestions in Opposition</b></td>
          <td style="width: 1%" class="detailData">&nbsp;</td>
        </tr>
        <tr>
          <td colspan="2" style="width: 14% " class="detailLabels" valign="top">&nbsp;</td>
          <td style="width: 86%  " class="detailData" colspan="2">SUGGESTIONS IN OPPOSITION TO RELATORS PETITION FOR WRIT OF MANDAMUS; Electronic Filing Certificate of Service.</td>
        </tr>
        <tr>
          <td colspan="2" style="width: 14%" class="detailLabels">&nbsp;</td>
          <td style="width: 86%" class="detailData" colspan="2">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<b>Filed By:</b>JOHN RICHARD SHANK JR
          </td>
        </tr>
        <tr>
          <td style="width: 14%" class="detailLabels" colspan="2"></td>
          <td style="width: 86%" class="detailData" colspan="2">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<b>On Behalf Of:</b>ELIZABETH DAVIS
          </td>
        </tr>

        <tr><td class="detailSeperator" colspan="6">&nbsp;</td></tr>
        <tr>
          <td valign="top" style="width: 11% " class="detailData"><b>01/22/2016</b></td>
          <td style="width: 3%" class="detailLabels" valign="top">&nbsp;</td>
          <td style="width: 85%" class="detailData alignData" colspan="3"><b>Court Order Issued</b></td>
          <td style="width: 1%" class="detailData">&nbsp;</td>
        </tr>
        <tr>
          <td colspan="2" style="width: 14% " class="detailLabels" valign="top">&nbsp;</td>
          <td style="width: 86%  " class="detailData" colspan="2">ORDER ISSUED: RESPONDENT REQUESTED TO FILE SUGGESTIONS IN OPPOSITION ON OR BEFORE 2:00 P.M. ON JANUARY 29, 2016.</td>
        </tr>
      </tbody>
    </table>

I want output like this. I put asterisks around the text that should be joined:

    ["ORDER ISSUED:  PETITION FOR WRIT OF MANDAMUS DENIED. MANDATE AVAILABLE TO COUNSEL OF RECORD VIA SECURE CASE.NET.",
     "**SUGGESTIONS IN OPPOSITION TO RELATORS PETITION FOR WRIT OF MANDAMUS; Electronic Filing Certificate of Service. Filed By:JOHN RICHARD SHANK JR  On Behalf Of:ELIZABETH DAVIS**",
     "ORDER ISSUED: RESPONDENT REQUESTED TO FILE SUGGESTIONS IN OPPOSITION ON OR BEFORE 2:00 P.M. ON JANUARY 29, 2016"]

I have tried this, but it does not join the text. Each piece comes back as a separate item, in particular the text surrounded by asterisks:

    if !tr.css('td.detailData').empty?
      ac_desc = tr.css('td.detailData')[0].text.strip.gsub("\n", '').gsub("\t", '')
    end
    if ac_desc != ""
      acc_descs << ac_desc
    end
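One possible approach, sketched below, is to treat each `td.detailSeperator` row as the start of a new entry and to join every detail row that follows it until the next separator. This is only a sketch under assumptions: the helper names `classify_rows` and `join_entries` are made up for illustration, the date/title row is assumed to be the one containing `td.alignData`, and whitespace runs (including the non-breaking spaces produced by `&nbsp;`) are collapsed to single spaces.

```ruby
# Hypothetical helper: turn each <tr> into a [kind, text] pair.
# Nokogiri is required lazily so join_entries can be used without the gem.
def classify_rows(html)
  require 'nokogiri'
  doc = Nokogiri::HTML(html)
  doc.css('table.detailRecordTable tr').map do |tr|
    if tr.at_css('td.detailSeperator')
      [:separator, '']                 # marks the start of a new entry
    elsif tr.at_css('td.alignData')
      [:header, '']                    # date + title row, assumed unwanted
    else
      cell = tr.at_css('td.detailData')
      [:detail, cell ? cell.text : '']
    end
  end
end

# Hypothetical helper: join all detail texts found between two separators.
def join_entries(rows)
  entries = [[]]
  rows.each do |kind, text|
    case kind
    when :separator
      entries << []                    # open a new group
    when :detail
      # [[:space:]] also matches U+00A0 (from &nbsp;), which String#strip leaves alone
      clean = text.gsub(/[[:space:]]+/, ' ').strip
      entries.last << clean unless clean.empty?
    end
  end
  entries.reject(&:empty?).map { |parts| parts.join(' ') }
end
```

With the table from the question in a string `html`, `acc_descs = join_entries(classify_rows(html))` should then yield one string per separator-delimited block, with the "Filed By:" and "On Behalf Of:" cells appended to the description that precedes them.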
                          