Merge multiple TIFFs into one PDF using iTextSharp based on last write time


I don't really know how to go about this. I can convert one TIFF to one PDF, and I can convert all the TIFFs in one directory into one PDF. What I want to do is convert groups of TIFFs based on their last write time, creation date, or modified date.

For example, if I have 7 TIFFs in one directory, where 3 share one timestamp and 4 share another, I want to merge the 3 into one PDF and the other 4 into another PDF. I'm stuck on how to approach this. Do I need to create a list of all the files and then group them, or can I merge the first 3, then move on to the next group and merge those, and so on, using a For Each?

The code below is what I'm using to collect the five most recently accessed files:

Dim dir As New DirectoryInfo(tiffPath)
Dim files As List(Of FileInfo) =
    dir.GetFiles("*.tif").
        OrderByDescending(Function(fc) fc.LastAccessTime).
        Take(5).
        ToList()

For Each lfi As FileInfo In files
    MsgBox(lfi.Name)
Next


Answer by Andrew Morton

It looks like it would be sufficient to bunch files together if their timestamps differ by less than some timespan.

So, if you order the files by their .LastWriteTimeUtc, you can iterate over that list and check the gap between each file and the previous one. If the gap is small, add the file to the current list; otherwise, start a new list.

I tested the following code on a directory with a random selection of files, so 30 days was an appropriate timespan for that test. For your use, two or three seconds would probably be suitable:

Option Infer On
Option Strict On

Imports System.IO

Module Module1

    ''' <summary>
    ''' Get FileInfos bunched by virtue of having less than some time interval between their consecutive LastWriteTimeUtc when ordered by that.
    ''' </summary>
    ''' <param name="srcDir">Directory to get files from.</param>
    ''' <param name="adjacencyLimit">The allowable timespan to count as in the same bunch.</param>
    ''' <returns>A List(Of List(Of FileInfo)). Within each inner list, consecutive LastWriteTimeUtc values differ by less than adjacencyLimit.</returns>
    Function GetTimeAdjacentFiles(srcDir As String, adjacencyLimit As TimeSpan) As List(Of List(Of FileInfo))
        Dim di = New DirectoryInfo(srcDir)
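        ' Materialise the sorted files once so that indexing below does not re-run the query.
        Dim fis = di.GetFiles().OrderBy(Function(fi) fi.LastWriteTimeUtc).ToList()

        If fis.Count = 0 Then
            ' Return an empty list (rather than Nothing) so callers can iterate the result safely.
            Return New List(Of List(Of FileInfo))
        End If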

        Dim bins As New List(Of List(Of FileInfo))
        Dim thisBin As New List(Of FileInfo) From {fis(0)}

        For i = 1 To fis.Count - 1
            If fis(i).LastWriteTimeUtc - fis(i - 1).LastWriteTimeUtc < adjacencyLimit Then
                thisBin.Add(fis(i))
            Else
                bins.Add(thisBin)
                thisBin = New List(Of FileInfo) From {fis(i)}
            End If
        Next

        bins.Add(thisBin)

        Return bins

    End Function

    Sub Main()
        Dim src = "E:\temp"
        'TODO: choose a suitable TimeSpan, e.g. TimeSpan.FromSeconds(3)
        Dim adjacencyLimit = TimeSpan.FromDays(30)
        Dim x = GetTimeAdjacentFiles(src, adjacencyLimit)

        For Each b In x
            Console.WriteLine("***********")
            For Each fi In b
                'TODO: merge each fi into a PDF.
                Console.WriteLine(fi.Name)
            Next
        Next

        Console.ReadLine()

    End Sub

End Module
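
To fill in the "merge each fi into a PDF" TODO, something along the following lines might work. It is only a sketch, assuming iTextSharp 5.x is referenced and that each TIFF has a single page (multi-page TIFFs would need extra handling). The module name TiffMerger and the function MergeTiffsToPdf are hypothetical names of my own, and using the pixel dimensions directly as the page size is a simplification that ignores the image's DPI.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Module TiffMerger

    ' Hypothetical helper: writes one PDF with one page per TIFF in the bunch.
    ' Assumes single-page TIFFs; only the first page of each file is used.
    Sub MergeTiffsToPdf(tiffs As IEnumerable(Of FileInfo), pdfPath As String)
        Dim sources As List(Of FileInfo) = tiffs.ToList()

        ' Nothing to merge: do not create an empty PDF file.
        If sources.Count = 0 Then Return

        Using fs As New FileStream(pdfPath, FileMode.Create)
            Dim doc As Document = Nothing

            For Each fi As FileInfo In sources
                Dim img As iTextSharp.text.Image = iTextSharp.text.Image.GetInstance(fi.FullName)
                ' Simplification: treat pixel dimensions as points, ignoring DPI.
                Dim pageSize As New iTextSharp.text.Rectangle(img.Width, img.Height)

                If doc Is Nothing Then
                    ' Size the first page before the document is opened.
                    doc = New Document(pageSize, 0, 0, 0, 0)
                    PdfWriter.GetInstance(doc, fs)
                    doc.Open()
                Else
                    ' The new size applies to the page started by NewPage().
                    doc.SetPageSize(pageSize)
                    doc.NewPage()
                End If

                img.SetAbsolutePosition(0, 0)
                doc.Add(img)
            Next

            doc.Close()
        End Using
    End Sub

End Module
```

In Main, the inner For Each could then be replaced by one call per bunch, for example MergeTiffsToPdf(b, Path.Combine(src, b(0).LastWriteTimeUtc.ToString("yyyyMMdd-HHmmss") & ".pdf")), where the output file name pattern is again just an assumption. Note that GetTimeAdjacentFiles as written picks up every file in the directory, so you would probably want di.GetFiles("*.tif") there.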

I suggest two or three seconds because if the files have ever been stored on a FAT-type filesystem (e.g. FAT32, as can be used on USB memory sticks, old disk drives, and such), the last-write timestamp will have a resolution of only two seconds.
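
To see why the limit should be a little more than two seconds rather than exactly two, here is a small standalone sketch that applies the same "less than adjacencyLimit" rule to bare timestamps instead of files. BunchTimes and BunchDemo are hypothetical names used only for this illustration, and the timestamps are made up to mimic the 3-and-4 example from the question with two-second granularity.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BunchDemo

    ' Hypothetical helper: the same bunching rule as GetTimeAdjacentFiles,
    ' applied to plain DateTime values so it can be tried without any files.
    Function BunchTimes(times As IEnumerable(Of DateTime), adjacencyLimit As TimeSpan) As List(Of List(Of DateTime))
        Dim bins As New List(Of List(Of DateTime))
        Dim ordered As List(Of DateTime) = times.OrderBy(Function(t) t).ToList()

        For Each t As DateTime In ordered
            ' Compare with the most recent timestamp in the current bunch.
            If bins.Count > 0 AndAlso t - bins.Last().Last() < adjacencyLimit Then
                bins.Last().Add(t)
            Else
                bins.Add(New List(Of DateTime) From {t})
            End If
        Next

        Return bins
    End Function

    Sub Main()
        ' Made-up timestamps: 3 around 10:00:00 and 4 around 10:05:00,
        ' some of them one two-second step apart.
        Dim t0 As New DateTime(2020, 1, 1, 10, 0, 0, DateTimeKind.Utc)
        Dim t1 As DateTime = t0.AddMinutes(5)
        Dim times As New List(Of DateTime) From {
            t0, t0, t0.AddSeconds(2),
            t1, t1, t1.AddSeconds(2), t1.AddSeconds(2)
        }

        For Each limit As TimeSpan In {TimeSpan.FromSeconds(3), TimeSpan.FromSeconds(2)}
            Dim sizes As IEnumerable(Of Integer) = BunchTimes(times, limit).Select(Function(b) b.Count)
            Dim sizeText As String = String.Join(", ", sizes)
            Console.WriteLine(limit.TotalSeconds.ToString() & "s limit: bunch sizes " & sizeText)
        Next
    End Sub

End Module
```

With a three-second limit the seven timestamps fall into bunches of 3 and 4. With a limit of exactly two seconds, the strict less-than comparison splits files that are one two-second step apart, giving bunches of 2, 1, 2 and 2.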