SISTRIX GmbH

Bonn

SERP Parser for SEO Analytics at SISTRIX

Software Craftsman | DevOps

September 2016 - January 2018
1 year 5 months
Full-time
Product work
Bonn

Relevance

Why this case matters

This summary states what matters for a hiring or collaboration decision: impact, proof, fit, and AI / delivery relevance.

System impact

SERP parser and big data delivery for SEO analytics with 450M+ keywords, 200M seeds, SERP feature analysis, status dashboards, and CI/CD operations.

AI / delivery relevance

AI-native systems craft depends on the same foundations: large data flows, reproducible pipelines, clear ownership, automated quality checks, and operationally reliable platforms.

SISTRIX, SERP Parser, You Build It You Run It, SEO Analytics

Proof

450M+: Keywords worldwide

200M: Crawler seeds

SERP Features: Parser product

Especially relevant for

  • Teams with data-intensive SaaS products where pipelines, delivery, and operations belong together.
  • Organizations that need reliable delivery and ownership on top of big data backend work.

Case context

Overview

SISTRIX needed SEO analytics pipelines that could be built, shipped, and operated by the same team. A central part was the SERP Parser: collect search results across countries and devices, classify them, and make parser status visible instead of treating the system as a hidden batch pipeline.

I worked with a "You Build It, You Run It" model across Jenkins/Docker CI/CD, Spark/Hadoop extraction for 450M+ keywords worldwide, and Apache Mesos/Marathon platform work. The parser made SERP Features such as organics, SEM, shopping, knowledge graph, maps, featured snippets, and sitemaps inspectable across countries and devices.
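To make the classification step concrete, here is a minimal sketch of counting SERP Features with XPath rules. It is an illustration under stated assumptions, not the SISTRIX implementation: the class name `SerpFeatureSketch`, the method `countFeatures`, the feature keys, and the CSS class names in the rules are all hypothetical. It also uses the XPath engine bundled with the JDK rather than SAXON-HE, and assumes the page is already well-formed XHTML, which real search result pages usually are not.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SerpFeatureSketch {

    // Hypothetical feature-to-XPath rules; the class names are invented for this example.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("organic", "count(//div[@class='result'])");
        RULES.put("shopping", "count(//div[@class='shopping-unit'])");
        RULES.put("featured_snippet", "count(//div[@class='snippet-box'])");
    }

    // Counts how often each configured feature occurs in a well-formed XHTML page.
    static Map<String, Integer> countFeatures(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (Map.Entry<String, String> rule : RULES.entrySet()) {
                // XPath count() evaluates to a number, returned as a Double.
                Double n = (Double) xpath.evaluate(rule.getValue(), doc, XPathConstants.NUMBER);
                counts.put(rule.getKey(), n.intValue());
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse page", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<div class='snippet-box'>Answer</div>"
                + "<div class='result'>A</div>"
                + "<div class='result'>B</div>"
                + "</body></html>";
        System.out.println(countFeatures(page));
    }
}
```

Keeping the rules in a map separates markup knowledge from parsing code, so a layout change for one country or device would only touch one rule in this sketch.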

Responsibility

Activities

  • SERP Parser: Made search results inspectable across countries, devices, and SERP Features
  • Parser status: Play Framework dashboard for weekly runs, success states, throughput, and calc nodes
  • Big Data Pipeline Development: HTML parser with XPath for millions of keywords per country, API integration
  • Data Extraction: HTML crawler with Spark/Hadoop for 200M seeds, structured data extraction
  • DevOps & CI/CD: "You Build It, You Run It" pipeline with Jenkins/Docker, automated deployments
  • PaaS Architecture: Apache Mesos/Marathon, AWS Route 53, scalable infrastructure
  • Quality Assurance: Automated acceptance tests for SaaS tools, Cucumber testing
  • Monitoring & Operations: Status dashboard with Play Framework, operational transparency
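
The parser status item above names success states and throughput as dashboard figures. As a rough sketch of how such figures could be derived from per-batch results, the example below aggregates them in plain Java. Everything in it is an assumption for illustration: the `Batch` fields, the method names, and the choice to sum batch durations (which only gives a meaningful throughput if batches run one after another). It does not reproduce the Play Framework dashboard.

```java
import java.util.Arrays;
import java.util.List;

final class ParserRunStatusSketch {

    // One country/device batch within a weekly run; all fields are illustrative.
    static final class Batch {
        final String country;
        final String device;
        final long keywordsParsed;
        final long keywordsFailed;
        final long durationSeconds;

        Batch(String country, String device, long keywordsParsed,
              long keywordsFailed, long durationSeconds) {
            this.country = country;
            this.device = device;
            this.keywordsParsed = keywordsParsed;
            this.keywordsFailed = keywordsFailed;
            this.durationSeconds = durationSeconds;
        }
    }

    // Share of keywords parsed successfully across all batches (0.0 for an empty run).
    static double successRate(List<Batch> batches) {
        long parsed = 0;
        long failed = 0;
        for (Batch b : batches) {
            parsed += b.keywordsParsed;
            failed += b.keywordsFailed;
        }
        long total = parsed + failed;
        return total == 0 ? 0.0 : (double) parsed / total;
    }

    // Parsed keywords per second, assuming batches ran sequentially.
    static double throughputPerSecond(List<Batch> batches) {
        long parsed = 0;
        long seconds = 0;
        for (Batch b : batches) {
            parsed += b.keywordsParsed;
            seconds += b.durationSeconds;
        }
        return seconds == 0 ? 0.0 : (double) parsed / seconds;
    }

    public static void main(String[] args) {
        List<Batch> run = Arrays.asList(
                new Batch("de", "desktop", 900, 100, 50),
                new Batch("us", "mobile", 600, 0, 30));
        System.out.println("success rate: " + successRate(run));
        System.out.println("throughput/s: " + throughputPerSecond(run));
    }
}
```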

Operating mode

Methodology

  • "You Build It, You Run It": End-to-end ownership for parser, pipelines, and operations
  • Operational visibility: Parser status, SERP Features, throughput, and failure states turned runtime behavior into product feedback
  • Big Data Processing: Spark/Hadoop and scalable data pipelines for repeatable search result analysis
  • CI/CD: Automated tests, continuous deployment, and deployment feedback kept delivery tied to runtime behavior

Technical context

Technology stack

The tools are not the point by themselves. What matters is which system layers had to work together.

9 Areas
36 Technologies

Backend

7
Java, Scala, Play Framework, Akka, RxJava, Groovy, jOOQ

Frontend

2
JavaScript, HTML Parsing

Data & AI

11
Spark, Hadoop, Big Data Processing, Nutch, Tika, XPath, SAXON-HE, Search Result Parsing, SERP Parser, SERP Feature Extraction, Data Extraction

DevOps

6
Docker, Git, Apache Mesos, Marathon, Container Orchestration, Status Dashboard

CI/CD & Delivery Pipelines

2
Jenkins, CI/CD Pipeline

Architecture

1
PaaS Architecture

Databases & Storage

3
MySQL, ArangoDB, Data Storage

Tools

2
Cucumber, Mockito

Practices

2
Testing, Quality Assurance

Next step

If you want to explore similar leverage for hiring, collaboration, or a concrete transformation, this is the right starting point.

Send a short note about the situation you are trying to assess. I reply personally and will be direct about fit.