Ray on Databricks


Ray is an open-source project first developed at RISELab that makes it simple to scale any compute-intensive Python workload. With a rich set of libraries and integrations built on a flexible distributed execution framework, Ray brings new use cases and simplifies the development of custom distributed Python functions that would normally be complicated to create. By Stephen Offer.

The article then does a good job of explaining the following:

  • Why do we need another distributed framework on top of Spark?
  • A simple introduction to Ray architecture
  • Starting Ray on a Databricks cluster
  • Distributing Python UDFs
  • Reinforcement learning

Applications of reinforcement learning broadly consist of scenarios where a simulation can be run, a cost function can be established, and the problem is complicated enough that hard-set logical rules or simpler heuristic models cannot be applied. The most famous cases of reinforcement learning are typically research-oriented, with an emphasis on game-play, such as AlphaGo, super-human level Atari agents, or simulated autonomous driving, but there are many real-world business use cases. Examples of recent applications are robotic manipulation control for factories, power consumption optimization, and even marketing and advertising recommendations. Good one!
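To make the "simulation plus cost function" idea concrete, here is a minimal sketch in plain Python. It is not taken from the article and does not use Ray: the corridor environment, the function names (`step`, `train`, `greedy_policy`) and all hyperparameters are made up for illustration, and tabular Q-learning is simply one of the simplest reinforcement learning methods that fits in a few lines.

```python
import random

N_STATES = 6        # hypothetical corridor: positions 0..5, position 5 is the goal
ACTIONS = (-1, +1)  # move one cell left or one cell right


def step(state, action):
    """Toy simulation: move along the corridor, paying a cost of 1 per step."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 0.0 if done else -1.0  # the cost function, expressed as negative reward
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned Q-table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Nothing here is hard-coded about "go right": the agent only sees the simulation's transitions and the per-step cost. Real use cases replace the corridor with a much richer simulator, which is where distributing the rollouts across a cluster becomes attractive.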

[Read More]

Tags data-science python machine-learning big-data