
Capture, Analyze, Act: Our Handy Guide On Monitoring The Twitter Feed Using ELK Stack

Real-Time Twitter Monitoring With ELK

Most businesses by now have realized the potential of social media for maintaining a constant connection with their customers and for reaching a large audience ahead of their competitors. However, leveraging social media well requires constant monitoring and fast response times. Twitter is one of the fastest-moving of the leading social media channels, so monitoring every relevant tweet is crucial for a business. The big question is how to get there. We have an answer for you.

We, at Net Solutions, developed a Twitter monitoring system using the ELK stack from Elastic. Have a look.

ELK – The Twitter Monitoring Tool At A Glance

Before we take you through the technical stack we used for developing this tool, let’s help you understand what it comprises.


ELK stands for Elasticsearch, Logstash, and Kibana. The trio combine to run log analysis while delivering the benefits of open source software.

Elasticsearch: Elasticsearch is the search and analytics engine that forms the backbone of the search feature on many sites today. It is the part of the stack where the data is stored, and it is responsible for providing all the search and analysis results. It is easy to manage, with a clean REST API and a simple setup.

Logstash: Logstash works at the front end and ingests the log or Twitter stream. It can parse unstructured logs into usable fields, or enrich events by adding a few fields. The structured and enriched content is then sent to Elasticsearch.

Kibana: Kibana lets you build presentable graphs and dashboards that help you understand the data, saving you the effort of making sense of the raw data Elasticsearch stores.

We’ll go a bit deeper into these components in the next few paragraphs.

Technical Stack We Used For The Monitoring Tool

How Logstash Works: Logstash has shipper and indexer components. Filebeat is a shipper that resides on the client, tails Apache logs, and forwards them to Logstash. This allows the indexer to work with multiple input sources together.


Logstash works as a pipeline: it ingests input in the first bucket, transforms and enriches it in the second, and finally sends output to one or more destinations in the third. Logstash follows a plugin architecture here: based on your stream’s content and structure, you can choose from the numerous plugins offered by Elastic and many other contributors.
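As a sketch, a Logstash pipeline configuration mirrors these three buckets directly. The port, file paths, and index name below are placeholders, not values from our setup:

```conf
# Hypothetical pipeline: receive Apache access logs from Filebeat,
# parse them into fields, and ship them to Elasticsearch.
input {
  beats {
    port => 5044                  # Filebeat ships tailed log lines here
  }
}

filter {
  grok {
    # Parse the unstructured log line into named fields
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "apache-logs-%{+YYYY.MM.dd}"
  }
}
```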

Logstash ingests not just Apache logs or Twitter streams; it also supports sources such as HTTP, files, JDBC, AWS CloudWatch metrics, Windows event logs, and GitHub updates. You can find more about these on Elastic’s site.

Similar to input plugins, Logstash also has many filter plugins that transform and enrich the input stream to make it usable for analysis downstream. These filters are usually powered by regular expressions.
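To illustrate the idea, grok patterns are essentially named regular expressions. The short Python sketch below mimics what a grok filter does to a single Apache common-log line; the log line itself is made up:

```python
import re

# A simplified version of what a grok pattern for Apache logs captures:
# client IP, timestamp, HTTP verb, request path, response code, and bytes.
LOG_PATTERN = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) \S+" (?P<response>\d{3}) (?P<bytes>\d+)'
)

def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# A made-up Apache access-log line
line = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
fields = parse_log_line(line)
```

The transformation turns one opaque string into structured fields that Elasticsearch can index and Kibana can chart.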

The output plugins allow multiple downstream destinations besides Elasticsearch, including email, file, Google Cloud Storage, HipChat, Jira, New Relic, and AWS S3.
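A pipeline can also fan out to more than one destination at once. The sketch below sends everything to Elasticsearch and, additionally, emails events that were tagged upstream; the host and address are placeholders:

```conf
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
  # Also notify by email when an event was tagged as important by a filter
  if "important" in [tags] {
    email {
      to      => "team@example.com"
      subject => "Logstash alert"
    }
  }
}
```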

How Elasticsearch Works: Elasticsearch centrally stores and indexes the data, and it is widely regarded as one of the fastest-growing search technologies.

Lucene is the underlying technology Elasticsearch uses for extremely fast data retrieval. Elasticsearch lets users harness the power of Lucene in a robust distributed system; creating a distributed Elasticsearch setup is usually just a matter of configuration.

Elasticsearch exposes a REST API that makes it very easy to use. The API gives the user control over Lucene, which acts as the engine underneath.

If you set out to use Elasticsearch, you need to know its building blocks. In brief:

  • Document: the basic unit of information, stored as JSON
  • Index: a collection of documents with similar characteristics
  • Shard: a slice of an index, which lets an index scale beyond one machine
  • Node: a single running instance of Elasticsearch
  • Cluster: one or more nodes that together hold all the data

We would suggest that you try out the REST API to create events and search stored events; you could use Postman for REST handling. Also have a look at the elasticsearch.yml file. Once you are familiar with the API, you can connect Logstash to send events to Elasticsearch.
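As an illustration, the two most basic calls are indexing a document and searching for it. The sketch below only builds the URLs and JSON bodies you would send (from Postman, curl, or any HTTP client); it assumes a local single-node cluster at localhost:9200 and a made-up index name:

```python
import json

ES_HOST = "http://localhost:9200"   # assumed local, single-node cluster
INDEX = "tweets"                     # hypothetical index name

# PUT http://localhost:9200/tweets/_doc/1  -- store one document as JSON
index_url = f"{ES_HOST}/{INDEX}/_doc/1"
index_body = json.dumps({"user": "someone", "message": "loving the new release"})

# GET http://localhost:9200/tweets/_search -- full-text search on a field
search_url = f"{ES_HOST}/{INDEX}/_search"
search_body = json.dumps({"query": {"match": {"message": "release"}}})
```

Sending those two requests against a running cluster is enough to see the store-then-search cycle the stack is built on.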

How Kibana Works: Our final component in the stack is Kibana for visualizations.

Its browser-based interface lets us quickly create and share dynamic dashboards that display changes in real time. You can read more details in the documentation provided by Elastic. With Kibana you will be able to:

  • Define an index pattern
  • Explore the sample data with Discover feature
  • Set up visualizations (pie charts, line graphs) using raw data
  • Assemble visualizations into a Dashboard


An interesting feature in Kibana is Timelion, a time series data visualizer. It enables answering questions like:

  • How many pages does each unique user view over time?
  • What’s the difference in traffic volume between this Friday and last Friday?
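For example, the second question above could be sketched as a Timelion expression that overlays this week’s traffic with the same series shifted back one week. The index name is a placeholder:

```
.es(index=tweets*, timefield=@timestamp).label("this week"),
.es(index=tweets*, timefield=@timestamp, offset=-1w).label("last week")
```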

Setting up the ELK stack involves installing the Logstash, Elasticsearch, and Kibana components (and an underlying Java environment). In a useful setup, you will also need to install other components such as Filebeat and X-Pack. After installing the components, you will need to configure the .yml files for each of them.
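As a minimal example, the relevant settings in those .yml files are mostly hostnames and ports. The values below are the common defaults, and the exact setting names can vary slightly across versions:

```yaml
# elasticsearch.yml -- where the cluster listens
network.host: localhost
http.port: 9200

# kibana.yml -- where Kibana runs and which cluster it talks to
server.port: 5601
elasticsearch.hosts: ["http://localhost:9200"]
```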

Install ELK For Twitter Monitoring In Three Simple Steps

Step 1: Register The Application

To access the Twitter Streaming API, register an application at http://apps.twitter.com, acquire your consumer key and consumer secret, and create an access token. These values will be used in the Logstash configuration.


Step 2: Configure The Filters In Logstash

As mentioned earlier, Logstash has three buckets, so to configure it we provide input, filter, and output elements. The twitter input plugin ingests the Twitter stream based on the keywords you configure.
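A sketch of that configuration, with placeholder credentials and keywords (the real values come from the app you registered in Step 1):

```conf
input {
  twitter {
    consumer_key       => "YOUR_CONSUMER_KEY"
    consumer_secret    => "YOUR_CONSUMER_SECRET"
    oauth_token        => "YOUR_ACCESS_TOKEN"
    oauth_token_secret => "YOUR_ACCESS_TOKEN_SECRET"
    keywords           => ["yourcompany", "yourproduct"]
    full_tweet         => true
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "twitter"
  }
}
```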


Step 3: Configure A Watcher

What you could do next is configure a Watcher that sends an email to a designated address. If someone posts a comment about your organization on Twitter, you would see it in your mailbox in real time. It could be a tweet from your next potential customer, who is also looking at your competitors.
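As a rough sketch, an X-Pack Watcher that checks the twitter index every few minutes and emails when new tweets have arrived might look like this. The index name, interval, and address are placeholders:

```json
{
  "trigger":  { "schedule": { "interval": "5m" } },
  "input": {
    "search": {
      "request": {
        "indices": ["twitter"],
        "body": {
          "query": { "range": { "@timestamp": { "gte": "now-5m" } } }
        }
      }
    }
  },
  "condition": { "compare": { "ctx.payload.hits.total": { "gt": 0 } } },
  "actions": {
    "notify_team": {
      "email": {
        "to": "social-team@example.com",
        "subject": "New tweets mentioning your brand"
      }
    }
  }
}
```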


The stack is also useful for looking back: you can analyze the tweets from the past few weeks or months and create useful analytics.

Benefits of Choosing ELK For Twitter Monitoring

  • Receive email notifications in real time on reactions to every tweet
  • Customize monitoring by adding filters specific to Twitter handles and geographies
  • Analyze activity on your Twitter wall through dashboards showing weekly and monthly data

Summing Up

Brand monitoring is a business analytics process concerned with monitoring various channels on the web and in the media to gain insight about a company, its products, its brand, and anything explicitly connected to the business. It is about tracking the brand’s reputation and its reception by the general public, the consumer base, and the target demographic.

The ELK trio provides a stack that is easy to set up and connect to Twitter. Analyzing the tweets over time and geography, or notifying stakeholders in real time of important Twitter activity, is just a matter of configuration.

If you are looking for any help on building such or any other digital solution for a better customer or employee engagement, please contact us at [email protected].

Atul Arora

About the Author

Atul Arora is currently working as the Principal Architect at Net Solutions. He has over 18 years of experience working in the JEE stack, with a particular interest in finance and retail web applications. He has been credited with developing end-to-end solutions using the JEE stack for enterprise customers, and has worked on several projects across industries like finance, healthcare, education, and retail. He loves playing badminton and table tennis, watching movies, and spending time with family.
