Analyzing patents with Google Dataproc and SynerScope

Author: Jorik Blaas

Google Cloud Dataproc is the latest publicly accessible beta product in the Google Cloud Platform portfolio, giving users access to managed Hadoop and Apache Spark for at-scale analytics. In real life, many datasets come in formats that are hard to work with directly. Patents are a typical example: they mix textual documents […]