This originally started with this SO question, and I’ll be honest, I was flummoxed for a couple of days looking at it (in no small part because the code was doing a lot). But at some point I was able to isolate all the issues down to dataFrame.saveToCassandra. Every 5 or so runs I’d get one or two errors:
or
I could write this data to a file with no issue, but when writing to Cassandra these errors would come flying out at me. With the help of Russ Spitzer (by help I mean explaining to my thick skull what was going on), I was pointed to the fact that reflection wasn’t thread safe in Scala 2.10.
No miracle here: objects have to be read via reflection, right down to the type checking (something that writing to a text file avoids), and then objects have to be populated on flush.
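
To make the thread-safety point concrete, here is a minimal sketch of one general way to protect thread-unsafe reflection: funnel every reflective call through a single lock. This is not necessarily what the connector does or how the original problem was fixed. `ReflectionGuard`, `withReflectionLock`, `ReflectionGuardDemo`, and `SomeRow` are hypothetical names made up for this illustration. The demo only counts how many threads are inside the guarded block at once, standing in for the real reflective work.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: not part of Spark or the Cassandra connector.
object ReflectionGuard {
  private val lock = new Object

  // Runs `body` while holding one JVM-wide lock, so only one thread at a
  // time can be inside code that touches Scala runtime reflection.
  def withReflectionLock[A](body: => A): A = lock.synchronized(body)
}

object ReflectionGuardDemo {
  // Starts `threadCount` threads that all enter the guarded block and
  // returns the highest number of threads seen inside it at the same time.
  def maxConcurrentEntries(threadCount: Int): Int = {
    val inside = new AtomicInteger(0)
    val maxSeen = new AtomicInteger(0)

    val threads = (1 to threadCount).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = ReflectionGuard.withReflectionLock {
          val now = inside.incrementAndGet()
          // In real code the reflective call would go here, for example
          // scala.reflect.runtime.universe.typeOf[SomeRow], where SomeRow
          // is a hypothetical case class.
          if (now > maxSeen.get()) maxSeen.set(now) // safe: we hold the lock
          Thread.sleep(1)
          inside.decrementAndGet()
          ()
        }
      })
    }

    threads.foreach(_.start())
    threads.foreach(_.join())
    maxSeen.get()
  }
}
```

The trade-off is that the guarded section becomes a serial bottleneck, so a wrapper like this only makes sense around the small reflective part of the work, not around the whole write.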