I am trying to write a Python function that returns the w-shingling of a text for a given shingle width w, but I would like every string in the shingled list to be lowercase.
I have tried putting [c.lower() for c in inputFile] inside the function, and similar things, without success.
import io

sample_text = io.StringIO("This is a sample text. It is a ordinary string but simulated to act as the contents of a file")

def wShingleOneFile(inputFile, w):
    for line in inputFile:
        words = line.split()
        [c.lower() for c in inputFile]
    return [words[i:i + w] for i in range(len(words) - w + 1)]

print(wShingleOneFile(sample_text, 3))
This is the output when printed:
[['This', 'is', 'a'], ['is', 'a', 'sample'], ['a', 'sample', 'text.'], ['sample', 'text.', 'It'], ['text.', 'It', 'is'], ['It', 'is', 'a'], ['is', 'a', 'ordinary'], ['a', 'ordinary', 'string'], ['ordinary', 'string', 'but'], ['string', 'but', 'simulated'], ['but', 'simulated', 'to'], ['simulated', 'to', 'act'], ['to', 'act', 'as'], ['act', 'as', 'the'], ['as', 'the', 'contents'], ['the', 'contents', 'of'], ['contents', 'of', 'a'], ['of', 'a', 'file']]
But I would like all of these letters to be lowercase.
Change line.split() to line.lower().split(), so that each line is lowercased before it is split into words.
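As a rough sketch, the function with that change applied might look like the following. Collecting the words of every line into one list is an assumption added here (the original code only keeps the words of the last line read), and the shortened sample string is just for illustration:

```python
import io

def wShingleOneFile(inputFile, w):
    # Gather lowercase words from every line.
    # Assumption: shingles are allowed to span line boundaries.
    words = []
    for line in inputFile:
        words.extend(line.lower().split())
    # Slide a window of width w over the word list.
    return [words[i:i + w] for i in range(len(words) - w + 1)]

sample_text = io.StringIO("This is a sample text. It is a ordinary string")
shingles = wShingleOneFile(sample_text, 3)
print(shingles[:2])  # [['this', 'is', 'a'], ['is', 'a', 'sample']]
```

Lowercasing the whole line once is cheaper and simpler than lowercasing each word after splitting, and it leaves the shingling logic itself untouched.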
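Also note that strings in Python are immutable: c.lower() returns a new string instead of changing c in place. In the sample you gave, the result of the list comprehension [c.lower() for c in inputFile] is thrown away, so you would need to assign it back to inputFile. In that case the transformation should also happen before the loop rather than inside it, because iterating over inputFile inside the loop consumes the lines that the loop itself is reading.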
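To illustrate the point about immutability, here is a small sketch. The two-line sample content is made up for the example, and rebinding inputFile to a list of lowercased lines is only one way to do it:

```python
import io

s = "This"
s.lower()   # returns a new string; the result is discarded here
print(s)    # This

inputFile = io.StringIO("This Is\nA File")
# Keep the comprehension's result by assigning it, and do so before the loop.
inputFile = [line.lower() for line in inputFile]
for line in inputFile:
    print(line.split())
# ['this', 'is']
# ['a', 'file']
```

Note that this reads the whole file into memory as a list, which is fine for small inputs but may not be what you want for large files.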