MLlib is Apache Spark's scalable machine learning library.

Ease of use

Usable in Java, Scala, Python, and R.

MLlib fits into Spark's APIs and interoperates with NumPy in Python (as of Spark 0.9) and R libraries (as of Spark 1.5). You can use any Hadoop data source (e.g. HDFS, HBase, or local files), making it easy to plug into Hadoop workflows.

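from pyspark.ml.clustering import KMeans

# Load LIBSVM-formatted data; `spark` is an existing SparkSession.
data = spark.read.format("libsvm")\
  .load("hdfs://...")

# Fit a K-means model with 10 clusters.
model = KMeans(k=10).fit(data)

Calling MLlib in Python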
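
As a slightly fuller sketch of the same KMeans call, the function below clusters a small in-memory list of points instead of reading from HDFS. The function name `cluster_points`, its parameters, and the sample points in the note that follows are illustrative assumptions, not part of MLlib; the sketch assumes PySpark is installed and that the caller passes in an existing `SparkSession`.

```python
def cluster_points(spark, points, k=2, seed=1):
    """Fit K-means on a small list of points and return the cluster centers.

    `spark` is an existing SparkSession; `points` is a list of equal-length
    numeric sequences (lists, tuples, or NumPy arrays).
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.linalg import Vectors

    # MLlib estimators expect a DataFrame with a vector-typed feature column;
    # KMeans reads the column named "features" by default.
    rows = [(Vectors.dense(p),) for p in points]
    data = spark.createDataFrame(rows, ["features"])

    model = KMeans(k=k, seed=seed).fit(data)

    # clusterCenters() returns one NumPy array per cluster.
    return [center.tolist() for center in model.clusterCenters()]
```

With a local session (for example one created by `SparkSession.builder.master("local[*]").getOrCreate()`), calling `cluster_points(spark, [[0.0, 0.0], [0.1, 0.1], [9.0, 9.0], [9.1, 9.1]])` should return two centers, one near each group of points.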

Performance

High-quality algorithms, up to 100x faster than MapReduce.

Spark excels at iterative computation, enabling MLlib to run fast. At the same time, we care about algorithmic performance: MLlib contains high-quality algorithms that leverage iteration, and can yield better results than the one-pass approximations sometimes used on MapReduce.
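
To make the point about iteration concrete, here is a minimal sketch in plain Python (not MLlib code) of logistic regression trained by batch gradient descent. Each pass reads the whole dataset again, which is the access pattern the paragraph above refers to. The toy data, the learning rate, and the function names are assumptions chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, b, xs, ys):
    """Average negative log-likelihood of labels ys given 1-D features xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def fit(xs, ys, passes, lr=0.5):
    """Batch gradient descent; every pass scans the full dataset once."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(passes):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data (made up): labels overlap, so the optimum is finite.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

one_pass = log_loss(*fit(xs, ys, passes=1), xs, ys)
many_passes = log_loss(*fit(xs, ys, passes=200), xs, ys)
print(f"loss after 1 pass:     {one_pass:.4f}")
print(f"loss after 200 passes: {many_passes:.4f}")
```

The loss after many passes is lower than after a single pass, which is why an engine that can rescan the same data cheaply suits this class of algorithm.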

Logistic regression in Hadoop and Spark

Runs everywhere

Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud, against diverse data sources.

You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
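
The cluster manager is selected through the master URL given to Spark. The helper below is a hypothetical convenience function, not part of Spark, that sketches the URL format for each of the modes listed above; the host names and ports in the example calls are placeholders.

```python
# Master URL templates for cluster managers that are addressed by host and port.
_HOSTED = {
    "standalone": "spark://{host}:{port}",
    "mesos": "mesos://{host}:{port}",
    "kubernetes": "k8s://https://{host}:{port}",
}


def master_url(mode, host=None, port=None):
    """Return a Spark master URL string for a deployment mode.

    Hypothetical helper for illustration; only the URL formats are Spark's.
    """
    if mode == "local":
        return "local[*]"  # run locally, using all available cores
    if mode == "yarn":
        return "yarn"      # cluster location is read from the Hadoop configuration
    if mode not in _HOSTED:
        raise ValueError(f"unknown mode: {mode!r}")
    if host is None or port is None:
        raise ValueError(f"mode {mode!r} needs a host and a port")
    return _HOSTED[mode].format(host=host, port=port)


print(master_url("local"))
print(master_url("standalone", "master.example.com", 7077))
```

The resulting string can be passed to `SparkSession.builder.master(...)` in application code or to the `--master` option of `spark-submit`.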

Algorithms

MLlib contains many algorithms and utilities.

ML algorithms include:

  • Classification: logistic regression, naive Bayes,...
  • Regression: generalized linear regression, survival regression,...
  • Decision trees, random forests, and gradient-boosted trees
  • Recommendation: alternating least squares (ALS)
  • Clustering: K-means, Gaussian mixtures (GMMs),...
  • Topic modeling: latent Dirichlet allocation (LDA)
  • Frequent itemsets, association rules, and sequential pattern mining

ML workflow utilities include:

  • Feature transformations: standardization, normalization, hashing,...
  • ML Pipeline construction
  • Model evaluation and hyper-parameter tuning
  • ML persistence: saving and loading models and Pipelines
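
The workflow utilities above can be combined roughly as follows. This is a sketch under stated assumptions rather than a recommended configuration: the function names, the column names `text` and `label`, and the grid values are illustrative choices, and both functions must be called after a `SparkSession` has been created.

```python
def build_tuned_pipeline(num_folds=3):
    """Return a CrossValidator wrapping a text-classification Pipeline.

    Assumes the training DataFrame has a string column "text" and a
    numeric 0/1 column "label".
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import HashingTF, Tokenizer
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    # Feature transformations: split text into words, then hash to a vector.
    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    hashing_tf = HashingTF(inputCol="words", outputCol="features")
    lr = LogisticRegression(maxIter=10)

    # Pipeline construction: stages run in order during fit and transform.
    pipeline = Pipeline(stages=[tokenizer, hashing_tf, lr])

    # Hyper-parameter tuning: every combination in the grid is evaluated.
    grid = (ParamGridBuilder()
            .addGrid(hashing_tf.numFeatures, [1000, 10000])
            .addGrid(lr.regParam, [0.1, 0.01])
            .build())

    return CrossValidator(estimator=pipeline,
                          estimatorParamMaps=grid,
                          evaluator=BinaryClassificationEvaluator(),
                          numFolds=num_folds)


def fit_and_save(cv, training, path):
    """Fit the cross-validator and persist the best PipelineModel to `path`."""
    best = cv.fit(training).bestModel
    best.write().overwrite().save(path)
    return best
```

A model saved this way can later be restored with `PipelineModel.load(path)` from `pyspark.ml`.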

Other utilities include:

  • Distributed linear algebra: SVD, PCA,...
  • Statistics: summary statistics, hypothesis testing,...
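
As one illustration of the statistics utilities, the sketch below computes per-column means and variances and a Pearson correlation matrix for a small set of rows. The function name `describe_features` and its return layout are assumptions made for this example; it assumes PySpark 2.4 or later (for `Summarizer`) and an existing `SparkSession`.

```python
def describe_features(spark, rows):
    """Return mean, variance, and correlation matrix for numeric rows.

    `rows` is a list of equal-length numeric sequences; each position is
    treated as one feature column.
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.ml.linalg import Vectors
    from pyspark.ml.stat import Correlation, Summarizer

    df = spark.createDataFrame([(Vectors.dense(r),) for r in rows], ["features"])

    # Summary statistics are computed as aggregate columns over the vectors.
    stats = df.select(
        Summarizer.mean(df["features"]),
        Summarizer.variance(df["features"]),
    ).first()

    # Correlation.corr returns a one-row DataFrame holding a dense matrix
    # (Pearson correlation by default).
    corr = Correlation.corr(df, "features").head()[0]

    return {
        "mean": stats[0].toArray().tolist(),
        "variance": stats[1].toArray().tolist(),
        "correlation": corr.toArray().tolist(),
    }
```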

Refer to the MLlib guide for usage examples.

Community

MLlib is developed as part of the Apache Spark project, so it is tested and updated with each Spark release.

If you have questions about the library, ask on the Spark mailing lists.

MLlib is still a rapidly growing project and welcomes contributions. If you'd like to submit an algorithm to MLlib, read how to contribute to Spark and send us a patch!

Getting started

To get started with MLlib:

  • Download Spark. MLlib is included as a module.
  • Read the MLlib guide, which includes various usage examples.
  • Learn how to deploy Spark on a cluster if you'd like to run in distributed mode. You can also run locally on a multicore machine without any setup.