[PR:#45] Avoid collecting training data to the driver and broadcasting it #53

Open
allwefantasy opened this issue Sep 29, 2017 · 0 comments


allwefantasy commented Sep 29, 2017

I checked the latest PR in spark-deep-learning, KerasImageFileEstimator, and while reviewing the code I found that it collects all training data to the driver and then broadcasts it to the executors. This means the entire training set must fit in a single server's memory, which will definitely not work in the real world, especially since deep learning is a data-hungry family of ML algorithms.

Maybe we can write the training data to a distributed message queue, e.g. Kafka, then have a TF queue receive data from Kafka and consume from that queue when the TF session starts. Something like:

class KerasImageFileEstimator(Estimator, HasInputCol, HasInputImageNodeName,
                              HasOutputCol, HasOutputNodeName, HasLabelCol,
                              HasKerasModel, HasKerasOptimizer, HasKerasLoss,
                              CanLoadImage, HasOutputMode):
    # proposed new params (names illustrative):
    distributedModel = Param(Params._dummy(), "distributedModel", "e.g. ParamsParallel")
    kafkaServer = Param(Params._dummy(), "kafkaServer", "e.g. 127.0.0.1:9092")
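
To make the data path concrete, here is a minimal sketch of the producer/consumer flow, assuming the kafka-python client. The broker address, topic name, helper names, and JSON serialization are all illustrative, and the Kafka-to-TF-queue bridge is shown as a plain Python generator rather than a real TF queue runner:

# A minimal sketch of the proposal; names and addresses are assumptions.
import json
from kafka import KafkaProducer, KafkaConsumer

KAFKA_SERVER = "127.0.0.1:9092"  # hypothetical broker
TOPIC = "training_data"          # hypothetical topic

def send_partition(rows):
    # Runs on each executor: stream one partition straight to Kafka
    # instead of collecting it to the driver.
    producer = KafkaProducer(bootstrap_servers=KAFKA_SERVER)
    for row in rows:
        producer.send(TOPIC, json.dumps(row.asDict()).encode("utf-8"))
    producer.flush()

# Producer side, where df is the training DataFrame:
#   df.rdd.foreachPartition(send_partition)

def training_batches(batch_size=32):
    # Runs next to the TF session: pull records from Kafka and yield
    # batches, so no full copy of the data ever sits in one process.
    consumer = KafkaConsumer(TOPIC, bootstrap_servers=KAFKA_SERVER)
    batch = []
    for message in consumer:
        batch.append(json.loads(message.value.decode("utf-8")))
        if len(batch) == batch_size:
            yield batch
            batch = []

# Consumer side: feed batches to the Keras model, or push them into a
# tf.FIFOQueue before the session starts.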

We could also put the data in HDFS as an option, but a message queue seems to be the better choice.
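
For the HDFS route, the write side is already parallel (path is illustrative):

# Executors write the training data out in parallel; nothing is
# collected to the driver or broadcast.
df.write.parquet("hdfs://namenode:8020/training_data")
# Each TF worker then reads its own shard back from HDFS.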
