better-files is a dependency-free pragmatic thin Scala wrapper around Java NIO.
Basically: scala.io.Source, done right. It’s simple enough that the user manual fits on the GitHub landing page.
Implementation
    object BetterFiles extends FileReader {
      override def consume(path: Path): Result =
        path.toFile.toScala
          .lineIterator
          .foldLeft(LineMetricsAccumulator.empty)(_ addLine _)
          .asResult

      override def description: String = "better-files"
    }
Ergonomics 😀
One of the really nice parts of this library is that the various iterators it creates close themselves when the end of the file is reached or an exception occurs. This makes the code much simpler to write, and helps lessen the creation/cleanup coupling issues that crop up when using the standard library version.
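To make the self-closing behaviour concrete, here is a minimal sketch that uses only the Java and Scala standard libraries. `SelfClosingLines` and `isClosed` are hypothetical names invented for this illustration; this is not how better-files is actually implemented, only one way such an iterator could work.

```scala
import java.io.BufferedReader

/** Hypothetical illustration only; not the better-files implementation.
  * Yields the reader's lines and closes the reader as soon as the input
  * is exhausted or a read throws.
  */
final class SelfClosingLines(reader: BufferedReader) extends Iterator[String] {
  private var closed = false
  private var lookahead: Option[String] = None // next line, if already read

  def isClosed: Boolean = closed

  private def close(): Unit =
    if (!closed) { closed = true; reader.close() }

  override def hasNext: Boolean = {
    if (lookahead.isEmpty && !closed) {
      try {
        lookahead = Option(reader.readLine()) // null at end of input becomes None
      } catch {
        case e: Throwable => close(); throw e // release the reader on failure
      }
      if (lookahead.isEmpty) close() // end of input: release the reader
    }
    lookahead.nonEmpty
  }

  override def next(): String = {
    if (!hasNext) throw new NoSuchElementException("no more lines")
    val line = lookahead.get
    lookahead = None
    line
  }
}
```

Draining such an iterator, for example with `.map(_.toInt).sum`, leaves `isClosed` set to `true` without any explicit cleanup at the call site. Stopping early, for example with `.take(1)`, does not.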
Safety 😕
Unfortunately, its abstractions leak a bit, partly because the difference between a self-closing iterator and a vanilla iterator isn’t represented in the type system. This means that it’d be easy to mix them up when migrating and forget to close a resource.
The other big issue is that it’s not intuitive that partial iteration doesn’t close the underlying resource. This is unfortunately something of a necessity, and comes down to the issues inherent in building an API over a mutable resource like Iterator while exposing that resource.
For example: which iterator should close the file in this example, someEvens or someOdds?
    val (evens, odds) = path.toFile.toScala
      .lineIterator
      .map(_.toInt)
      .partition(_ % 2 == 0)

    val someEvens = evens.take(5)
    val someOdds = odds.take(5)
It’s not really a choice that can be made without out-of-band knowledge of the file and program, so it makes sense that this was punted to the user. Unfortunately, the types don’t reflect this, so it’s easy to lose track of which iterators close the underlying file and which don’t.
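One way to sidestep the question is to tie the resource lifetime to a lexical scope instead of to any one iterator. The sketch below uses `scala.util.Using` from the standard library (Scala 2.13 or later) rather than anything from better-files. `TrackedReader`, `PartialIteration` and `firstEvensAndOdds` are hypothetical names, and the in-memory reader stands in for a real file.

```scala
import java.io.{BufferedReader, StringReader}
import scala.util.Using

/** Hypothetical in-memory reader that records whether close() was called. */
final class TrackedReader(text: String) extends BufferedReader(new StringReader(text)) {
  var closed = false
  override def close(): Unit = { closed = true; super.close() }
}

object PartialIteration {
  /** Lines as a plain lazy iterator; nothing here closes the reader. */
  private def lines(reader: BufferedReader): Iterator[String] =
    Iterator.continually(reader.readLine()).takeWhile(_ != null)

  /** Takes up to n evens and n odds; the reader is closed when the block exits,
    * no matter how much of the input either iterator consumed.
    */
  def firstEvensAndOdds(reader: BufferedReader, n: Int): (List[Int], List[Int]) =
    Using.resource(reader) { r =>
      val (evens, odds) = lines(r).map(_.toInt).partition(_ % 2 == 0)
      // Materialize inside the block: the iterators are unusable once r is closed.
      (evens.take(n).toList, odds.take(n).toList)
    }
}
```

The trade-off is that results have to be materialized before the block ends, so this gives up some of the laziness that makes the iterator API attractive in the first place.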
Performance
Unsurprisingly, the performance characteristics of Better Files are nearly identical to those of the Scala Standard Library, so there’s not much to add beyond what was already said about the Scala Standard Library in part 2.
library | env | wall clock (mm:ss ± %) | % of best in env | % of best | % of reference | % change from local |
---|---|---|---|---|---|---|
Scala StdLib | local | 00:36.643 ± 1.91 % | 100.00 % | 100.00 % | 20.34 % | 0.00 % |
better-files | local | 00:36.818 ± 2.46 % | 100.48 % | 100.48 % | 20.44 % | 0.00 % |
Scala StdLib | EC2 | 02:02.973 ± 8.83 % | 100.00 % | 335.59 % | 68.26 % | 235.59 % |
better-files | EC2 | 02:04.564 ± 3.09 % | 101.29 % | 339.93 % | 69.14 % | 238.32 % |
Java StdLib | EC2 | 03:00.161 ± 23.98 % | 146.50 % | 491.66 % | 100.00 % | 131.71 % |
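The derived columns appear to be simple ratios of the wall-clock times. The sketch below shows that arithmetic under this assumption; `BenchmarkColumns` and its methods are hypothetical helpers written for this illustration, not part of the benchmark harness.

```scala
object BenchmarkColumns {
  /** Converts an "mm:ss.SSS" wall-clock string to seconds. */
  def seconds(wallClock: String): Double =
    wallClock.split(":") match {
      case Array(minutes, secs) => minutes.toInt * 60 + secs.toDouble
      case _ => throw new IllegalArgumentException(s"expected mm:ss.SSS, got: $wallClock")
    }

  /** value expressed as a percentage of baseline (the "% of ..." columns). */
  def percentOf(value: Double, baseline: Double): Double =
    value / baseline * 100.0

  /** Relative change from baseline in percent (the "% change from local" column). */
  def percentChange(value: Double, baseline: Double): Double =
    percentOf(value, baseline) - 100.0
}
```

For instance, `percentOf(seconds("02:02.973"), seconds("00:36.643"))` comes out at roughly 335.6, in line with the 335.59 % reported for the Scala StdLib on EC2. Small differences in the last digit are expected because the table shows rounded timings.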
Memory Usage
Memory usage was also a very close match to the Standard Library. Turns out it’s a very thin wrapper.
library | env | peak memory used (MB ± %) | % of best in env | % of best | % of reference |
---|---|---|---|---|---|
Java StdLib | EC2 | 328.89 ± 9.71 % | 100.00 % | 102.30 % | 100.00 % |
better-files | EC2 | 365.59 ± 0.07 % | 111.16 % | 113.71 % | 111.16 % |
Scala StdLib | EC2 | 365.64 ± 0.06 % | 111.17 % | 113.73 % | 111.17 % |
Scala StdLib | local | 916.20 ± 7.59 % | 284.97 % | 284.97 % | 278.57 % |
better-files | local | 920.19 ± 9.03 % | 286.21 % | 286.21 % | 279.79 % |
Conclusion
For simple transformations, the usability boost over the Scala Standard Library makes such a difference that there isn’t really any reason not to use Better Files. For more complicated transformations, it’s probably easier to use one of the more expressive libraries to avoid the safety issues.
Up next: a very expressive library