GLIBCXX_3.4.9 Could Not Be Found with Apache Spark
If you encounter an error similar to the following while running an application with Apache Spark, complaining that GLIBCXX_3.4.9 could not be found, you can avoid it by switching Spark's compression codec from snappy to something such as lzf:
....
Caused by: java.lang.UnsatisfiedLinkError: .../snappy-1.0.5.3-1e2f59f6-8ea3-4c03-87fe-dcf4fa75ba6c-libsnappyjava.so: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by .../snappy-1.0.5.3-1e2f59f6-8ea3-4c03-87fe-dcf4fa75ba6c-libsnappyjava.so)
Spark's snappy support comes through snappy-java, which bundles a prebuilt native library (the libsnappyjava.so in the stack trace above) compiled against a newer libstdc++ than the system provides, whereas lzf is implemented in pure Java and has no such native dependency. There are a few ways to pass configuration options to Spark. The simplest seems to be on the command line:
--conf "spark.io.compression.codec=lzf"
On a side note, you can list which GLIBCXX versions a libstdc++ build provides (the error above points at /usr/lib64/libstdc++.so.6) by running
strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX
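On an affected system the listing stops before GLIBCXX_3.4.9. Purely as an illustration (assuming an older toolchain such as GCC 4.1, whose libstdc++ only exports versions up to GLIBCXX_3.4.8), the output would look like:
GLIBCXX_3.4
GLIBCXX_3.4.1
...
GLIBCXX_3.4.8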
References
- A mail thread on this error
- Spark configuration options: http://spark.apache.org/docs/1.0.1/configuration.html
Comments
Great post. I got the exact same problem, but I am using pyspark. Since I don't have sudo permission, I am thinking about working around snappy. Should I just pass the codec configuration in the shell, like python count.py --conf "spark.io.compression.codec=lzf"?
Thanks. This is a parameter that you need to pass to the Spark runtime. You can either set it as an environment variable or pass it as shown above when starting Spark. See more info at http://spark.apache.org/docs/1.0.1/configuration.html
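For a PySpark script like the count.py mentioned above, the option goes to spark-submit (or the pyspark shell) rather than to the Python interpreter, e.g.:
spark-submit --conf "spark.io.compression.codec=lzf" count.py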
Thanks, it helped me.