dislimit/data-pipeline-storm

 
 

Data Pipeline Guidance (with Apache Storm)

Microsoft patterns & practices

This project focuses on using Apache Storm/Trident with Java. For guidance on using .NET without Storm, see the companion Data Pipeline Guidance.

Overview

The two primary concerns of this project are:

  • Facilitating cold storage of data for later analytics. That is, translating the chatty stream of events into chunky blobs.

  • Demonstrating how to use OpaqueTridentEventHubSpout and Apache Storm/Trident to store Microsoft Azure Event Hubs messages in Microsoft Azure Blob storage with exactly-once semantics.
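
The two concerns above can be illustrated with a small, self-contained sketch. It is not the project's code: the class `ChunkyBlobSketch`, its methods, and the in-memory maps standing in for the blob and the state store are all hypothetical, and it does not use the Storm, Event Hubs, or Azure Storage APIs. It assumes the general opaque-transactional pattern used with Trident: for each partition the state store keeps the last transaction id together with the committed position before and after that transaction, so a replayed batch (which may contain different events) replaces its earlier attempt instead of being appended twice.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: chatty events in, chunky blocks out, committed exactly once. */
class ChunkyBlobSketch {

    /** Per-partition record an opaque state keeps (the sample stores its state in Redis). */
    static final class OpaqueState {
        final long txid;       // last transaction id applied
        final int prevBlocks;  // committed block count before that transaction
        final int currBlocks;  // committed block count after that transaction

        OpaqueState(long txid, int prevBlocks, int currBlocks) {
            this.txid = txid;
            this.prevBlocks = prevBlocks;
            this.currBlocks = currBlocks;
        }
    }

    private final int maxBlockBytes;
    // In-memory stand-ins (assumptions) for the state store and the blob's block list.
    private final Map<String, OpaqueState> stateStore = new HashMap<>();
    private final Map<String, List<String>> blobs = new HashMap<>();

    ChunkyBlobSketch(int maxBlockBytes) {
        this.maxBlockBytes = maxBlockBytes;
    }

    /**
     * Packs newline-terminated events into blocks of at most maxBlockBytes (UTF-8).
     * An event larger than the limit is placed alone in its own oversized block.
     */
    static List<String> pack(List<String> events, int maxBlockBytes) {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        for (String event : events) {
            String line = event + "\n";
            int lineBytes = line.getBytes(StandardCharsets.UTF_8).length;
            if (currentBytes > 0 && currentBytes + lineBytes > maxBlockBytes) {
                blocks.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            current.append(line);
            currentBytes += lineBytes;
        }
        if (currentBytes > 0) {
            blocks.add(current.toString());
        }
        return blocks;
    }

    /** Applies one batch for a partition; replaying the same txid overwrites the earlier attempt. */
    void commit(String partition, long txid, List<String> events) {
        OpaqueState last = stateStore.get(partition);
        int base;
        if (last == null) {
            base = 0;                 // first batch for this partition
        } else if (last.txid == txid) {
            base = last.prevBlocks;   // replay: roll back to the position before the first attempt
        } else {
            base = last.currBlocks;   // new transaction: continue after the last commit
        }
        List<String> blob = blobs.computeIfAbsent(partition, k -> new ArrayList<>());
        while (blob.size() > base) {
            blob.remove(blob.size() - 1);  // discard blocks written by the earlier attempt
        }
        blob.addAll(pack(events, maxBlockBytes));
        stateStore.put(partition, new OpaqueState(txid, base, blob.size()));
    }

    /** Returns the blocks currently committed for a partition. */
    List<String> blocks(String partition) {
        return blobs.getOrDefault(partition, new ArrayList<>());
    }
}
```

Calling `commit` twice with the same transaction id leaves only the blocks from the second call, which is the property that makes a replayed batch safe. In a real deployment the rollback and the append would need to be applied to the blob and the state store in a way that survives a crash between the two steps; this sketch ignores that.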

Next Steps

Backlog

  • Performance results: The performance results will be published once performance testing is complete.

  • Using ZooKeeper to store the state: The current sample stores state in Redis Cache. We plan to replace it with ZooKeeper.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

About

Translating an Event Hub stream to chunky blobs using Apache Storm and Trident

Languages

  • Java 84.7%
  • C# 15.3%