I'll add my two cents as well.
- "The %Library package also includes stream classes, but those are deprecated. The class library includes additional stream classes, but those are not intended for general use." (Working with Streams)
In my tests, %Stream.FileCharacter was an order of magnitude faster than %[Library.]File.
- If you replace line-by-line reading with reading blocks and then parsing the lines out of each block, the speed increases by another order of magnitude.
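To make the "read blocks, then parse lines" idea concrete, here is a minimal sketch of one way to do the parsing. The class name dc.testblock, the method name CountLinesByBlock, the 32000-character block size, and the assumption that lines end in LF (optionally preceded by CR) are all mine, not part of the original sample. The main point is that the tail of a block can be an unfinished line, so it is carried over and prepended to the next block.

```objectscript
/// Hypothetical helper class for illustration only
Class dc.testblock [ Abstract ]
{

/// Reads fCSV in blocks, splits each block into lines and returns the number of lines seen
ClassMethod CountLinesByBlock(fCSV As %String) As %Integer
{
  s stream = ##class(%Stream.FileCharacter).%New()
  d stream.LinkToFile(fCSV)
  s count=0,rest=""
  While 'stream.AtEnd {
    // prepend the unfinished line left over from the previous block;
    // the block size is kept well below the maximum string length on purpose
    s chunk=rest_stream.Read(32000)
    s pos=1
    f {
      // $find returns the position just after the LF, or 0 if there is none
      s nl=$f(chunk,$c(10),pos) q:'nl
      s line=$e(chunk,pos,nl-2)
      // drop a trailing CR (assumes CRLF or LF line ends)
      s:$e(line,*)=$c(13) line=$e(line,1,*-1)
      // do something with line
      s count=count+1
      s pos=nl
    }
    // whatever follows the last LF is an incomplete line: keep it for the next block
    s rest=$e(chunk,pos,*)
  }
  // a last line without a terminating LF
  i rest'="" { s count=count+1 }
  q count
}

}
```

This sketch assumes no single line is long enough for the carried-over remainder plus a new block to exceed the maximum string length. It also does not handle quoted CSV fields that contain line breaks.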
Sample
    Class dc.test [ Abstract ]
    {

    ClassMethod ReadCSVStream(fCSV As %String) As %String
    {
      s stream = ##class(%Stream.FileCharacter).%New()
      d stream.LinkToFile(fCSV)
      s time1=$zh
      While 'stream.AtEnd {
        s line=stream.ReadLine($$$MaxLocalLength)
      }
      s diff=$zh-time1
      q diff
    }

    ClassMethod ReadCSVStreamBlock(fCSV As %String) As %String
    {
      s stream = ##class(%Stream.FileCharacter).%New()
      d stream.LinkToFile(fCSV)
      s i=0,time1=$zh
      While 'stream.AtEnd {
        s chunks($i(i))=stream.Read($$$MaxLocalLength)
        // do parsing chunk to lines
      }
      s diff=$zh-time1
      q diff
    }

    ClassMethod ReadCSVOURC(fCSV As %String) As %String
    {
      o fCSV::1 e  q "Missing File"
      s eof=$zu(68,40,1)
      u fCSV
      s time1=$zh
      f {
        r line q:$zeof
        // do something with line
      }
      s diff=$zh-time1
      c fCSV
      d $zu(68,40,eof)
      q diff
    }

    /// d ##class(dc.test).Test()
    ClassMethod Test()
    {
      s ptr=0, clnm=$classname()
      &sql(select %dlist(Name) into :list from %Dictionary.CompiledMethod where Parent=:clnm and Name %startswith 'ReadCSV' group by Parent order by SequenceNumber)
      while $listnext(list,ptr,m) {
        w !,"[",m,"]",?20,"execution: ",$classmethod(clnm,m,"data.csv"),!
      }
    }

    }

- On the Internet you can find a lot of material comparing the speed of reading files (CSV in particular) across programming languages (Python, C/C++, R, C#, Java, etc.), for example (this is a machine translation). Those who make such comparisons often do not know all of these languages equally well, so curious results sometimes occur.
Which do you think was faster in the article above when reading 1e7+ lines: Fortran or C++?
Fortran :)
- Approached formally, the advantage goes to compiled languages over interpreted ones, and to the implementation that makes full use of the operating system and hardware.