<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[~/Blog/Nikhil]]></title><description><![CDATA[~/Blog/Nikhil]]></description><link>https://nikhilmishra.xyz</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1740581872593/84b7710b-23bc-429f-bfff-58dd82c86b01.png</url><title>~/Blog/Nikhil</title><link>https://nikhilmishra.xyz</link></image><generator>RSS for Node</generator><lastBuildDate>Sat, 11 Apr 2026 16:41:20 GMT</lastBuildDate><atom:link href="https://nikhilmishra.xyz/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Code, Vibes and Nostalgia]]></title><description><![CDATA[My Journey with Vibe Coding 🚀
My journey with vibe coding started when I built aushadiai.nikhilmishra.live at a hackathon, and I got hooked on it.
Next, I wanted to build games.
So I recreated two games I used to play on a Nokia handset as a child:

...]]></description><link>https://nikhilmishra.xyz/vibecode</link><guid isPermaLink="true">https://nikhilmishra.xyz/vibecode</guid><category><![CDATA[Game Development]]></category><category><![CDATA[vibe coding]]></category><category><![CDATA[Games]]></category><category><![CDATA[development]]></category><category><![CDATA[windsurf]]></category><category><![CDATA[cursor]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Fri, 09 May 2025 09:30:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1746782832114/34f71fb2-7791-4bc2-9c5b-2be9c5d015ea.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-my-journey-with-vibe-coding">My Journey with Vibe Coding 🚀</h2>
<p>My journey with vibe coding started when I built <a target="_blank" href="https://aushadiai.nikhilmishra.live">aushadiai.nikhilmishra.live</a> at a hackathon,<br />and I got hooked on it.</p>
<p>Next, I wanted to build games.</p>
<p>So I recreated two games I used to play on a Nokia handset as a child:</p>
<ul>
<li><p><a target="_blank" href="https://bounce.nikhilmishra.live">bounce.nikhilmishra.live</a><br />This didn’t turn out quite as I planned, but I tried.</p>
</li>
<li><p><a target="_blank" href="https://isitcricket.nikhilmishra.live">isitcricket.nikhilmishra.live</a><br />This one was a success for me; I got praise from a lot of cool people.</p>
</li>
</ul>
<p>My brother wanted some Pokémon cards.</p>
<p>So I vibe coded <a target="_blank" href="https://pokemon.nikhilmishra.live">pokemon.nikhilmishra.live</a>,<br />where he gets unlimited cards.</p>
<hr />
<h2 id="heading-explore-amp-share">Explore &amp; Share 🎮✨</h2>
<p><strong>Visit all of them and share what you think about them with me!</strong></p>
<p>Every click and comment means a lot.</p>
<hr />
<h3 id="heading-thanks-for-reading">Thanks for Reading!</h3>
<p>If you enjoyed my journey or have any feedback, let me know in the comments below.<br />Your thoughts motivate me to build more fun projects!</p>
<hr />
]]></content:encoded></item><item><title><![CDATA[From First Principles to Production: Automating GitHub Pages Deployments with GitHub Actions]]></title><description><![CDATA[Introduction
In the world of modern software development, automation is not just a convenience but a necessity. Continuous Integration and Continuous Deployment (CI/CD) have become fundamental practices that enable developers to deliver software upda...]]></description><link>https://nikhilmishra.xyz/github-actions-deployment-workflow</link><guid isPermaLink="true">https://nikhilmishra.xyz/github-actions-deployment-workflow</guid><category><![CDATA[GitHub]]></category><category><![CDATA[github-actions]]></category><category><![CDATA[deployment]]></category><category><![CDATA[workflow]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[Devops]]></category><category><![CDATA[GitHubPages]]></category><category><![CDATA[static site generation]]></category><category><![CDATA[IaC (Infrastructure as Code)]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Fri, 14 Mar 2025 12:01:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741953582609/b2a62917-d988-44cc-a9a4-867362a0d824.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the world of modern software development, automation is not just a convenience but a necessity. Continuous Integration and Continuous Deployment (CI/CD) have become fundamental practices that enable developers to deliver software updates rapidly and reliably. At the heart of this revolution is GitHub Actions, a powerful automation platform integrated directly into the GitHub ecosystem.</p>
<p>This blog post explores GitHub Actions from first principles, breaking down the concepts and components that make automated deployments possible. Using a real-world example of deploying a static website to GitHub Pages, we'll examine the underlying mechanics and best practices of GitHub Actions workflows.</p>
<h2 id="heading-understanding-cicd-from-first-principles">Understanding CI/CD from First Principles</h2>
<p>Before diving into the technical details, let's understand the fundamental concepts behind CI/CD:</p>
<ol>
<li><strong>Continuous Integration (CI)</strong>: The practice of frequently integrating code changes into a shared repository, followed by automated builds and tests.</li>
<li><strong>Continuous Deployment (CD)</strong>: The practice of automatically deploying every change that passes all test stages to production.</li>
</ol>
<p>At their core, these practices address a fundamental challenge in software development: <strong>how to safely and efficiently move code from development to production</strong>.</p>
<p>The fundamental components of any CI/CD system include:</p>
<ul>
<li><strong>Triggers</strong>: Events that initiate the automation workflow</li>
<li><strong>Runners</strong>: Environments where the automation tasks are executed</li>
<li><strong>Steps</strong>: Individual tasks to be performed</li>
<li><strong>Artifacts</strong>: Files produced during the workflow execution</li>
<li><strong>Environments</strong>: Deployment targets with specific configurations</li>
</ul>
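<p>These components map almost one-to-one onto GitHub Actions workflow syntax. Here is a minimal, annotated sketch of that mapping (the workflow and job names are illustrative, not from the example project):</p>
<pre><code class="lang-yaml">name: minimal-ci           # hypothetical workflow name
on: push                   # trigger: the event that starts the workflow

jobs:
  build:
    runs-on: ubuntu-latest # runner: the environment where tasks execute
    steps:                 # steps: individual tasks, run in order
      - uses: actions/checkout@v4
      - run: echo "build the site here"
      - uses: actions/upload-artifact@v4   # artifact: files produced by the run
        with:
          name: site
          path: .
</code></pre>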
<h2 id="heading-github-actions-architecture-and-components">GitHub Actions: Architecture and Components</h2>
<p>GitHub Actions is built on a simple yet powerful model that follows these first principles. Let's visualize its architecture:</p>
<pre><code class="lang-mermaid">graph TD
    A[Repository] --&gt;|Event Trigger| B[Workflow]
    B --&gt; C[Jobs]
    C --&gt; D[Steps]
    D --&gt; E[Actions]
    E --&gt; F[Outputs]
    E --&gt; G[Artifacts]
    G --&gt; H[Deployment]
</code></pre>
<p>The key components are:</p>
<ol>
<li><strong>Workflows</strong>: YAML files that define the automation process</li>
<li><strong>Events</strong>: Triggers like push, pull request, or scheduled events</li>
<li><strong>Jobs</strong>: Groups of steps that execute on the same runner</li>
<li><strong>Steps</strong>: Individual tasks that run commands or actions</li>
<li><strong>Actions</strong>: Reusable units of code that can be shared and consumed</li>
<li><strong>Runners</strong>: The compute infrastructure where workflows run</li>
</ol>
<h2 id="heading-anatomy-of-a-github-pages-deployment-workflow">Anatomy of a GitHub Pages Deployment Workflow</h2>
<p>Let's examine our example project's workflow file from first principles. The workflow is defined in <code>.github/workflows/deploy.yml</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">Deploy</span> <span class="hljs-string">to</span> <span class="hljs-string">GitHub</span> <span class="hljs-string">Pages</span>

<span class="hljs-attr">on:</span>
  <span class="hljs-attr">push:</span>
    <span class="hljs-attr">branches:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>

<span class="hljs-attr">permissions:</span>
  <span class="hljs-attr">contents:</span> <span class="hljs-string">read</span>
  <span class="hljs-attr">pages:</span> <span class="hljs-string">write</span>
  <span class="hljs-attr">id-token:</span> <span class="hljs-string">write</span>

<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">deploy:</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">name:</span> <span class="hljs-string">github-pages</span>
      <span class="hljs-attr">url:</span> <span class="hljs-string">${{</span> <span class="hljs-string">steps.deployment.outputs.page_url</span> <span class="hljs-string">}}</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Checkout</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Setup</span> <span class="hljs-string">Pages</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/configure-pages@v4</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Upload</span> <span class="hljs-string">artifact</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/upload-pages-artifact@v3</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">path:</span> <span class="hljs-string">'.'</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Deploy</span> <span class="hljs-string">to</span> <span class="hljs-string">GitHub</span> <span class="hljs-string">Pages</span>
        <span class="hljs-attr">id:</span> <span class="hljs-string">deployment</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/deploy-pages@v4</span>
</code></pre>
<p>This seemingly simple file encapsulates several fundamental principles:</p>
<h3 id="heading-1-event-driven-architecture">1. <strong>Event-Driven Architecture</strong></h3>
<p>The workflow is triggered by a specific event: a push to the main branch.</p>
<pre><code class="lang-mermaid">graph LR
    A[Developer] --&gt;|Pushes to| B[Main Branch]
    B --&gt;|Triggers| C[Workflow]
</code></pre>
<h3 id="heading-2-declarative-configuration">2. <strong>Declarative Configuration</strong></h3>
<p>The entire workflow is defined declaratively, stating what should happen rather than how. This follows the principle of <strong>infrastructure as code</strong>.</p>
<h3 id="heading-3-security-first-design">3. <strong>Security-First Design</strong></h3>
<p>The <code>permissions</code> section explicitly defines the minimal set of permissions needed, following the principle of least privilege:</p>
<pre><code class="lang-mermaid">flowchart TD
    A[Workflow Permissions] --&gt; B[contents: read]
    A --&gt; C[pages: write]
    A --&gt; D[id-token: write]

    B --&gt; E[Read repository content]
    C --&gt; F[Modify GitHub Pages]
    D --&gt; G[Create &amp; use OIDC tokens]
</code></pre>
<h3 id="heading-4-sequential-pipeline-architecture">4. <strong>Sequential Pipeline Architecture</strong></h3>
<p>The steps form a sequential pipeline, where each step builds on the previous one:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant R as Repository
    participant W as Workflow
    participant GH as GitHub Pages

    W-&gt;&gt;R: Checkout code
    W-&gt;&gt;W: Setup Pages configuration
    W-&gt;&gt;W: Create artifact
    W-&gt;&gt;GH: Deploy artifact
    GH-&gt;&gt;GH: Publish website
</code></pre>
<h2 id="heading-breaking-down-the-workflow-step-by-step-analysis">Breaking Down the Workflow: Step-by-Step Analysis</h2>
<p>Let's examine each step of the workflow from first principles:</p>
<h3 id="heading-1-checkout">1. Checkout</h3>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Checkout</span>
  <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>
</code></pre>
<p>This step fetches the repository content. It's fundamental because:</p>
<ul>
<li>It provides the workflow with access to the source code</li>
<li>It allows the workflow to operate on the latest version of the code</li>
<li>Without it, the workflow would have nothing to deploy</li>
</ul>
<h3 id="heading-2-setup-pages">2. Setup Pages</h3>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Setup</span> <span class="hljs-string">Pages</span>
  <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/configure-pages@v4</span>
</code></pre>
<p>This step configures the GitHub Pages environment. It's necessary because:</p>
<ul>
<li>It initializes the required environment variables</li>
<li>It sets up the underlying GitHub Pages infrastructure</li>
<li>It prepares the system for artifact upload and deployment</li>
</ul>
<h3 id="heading-3-upload-artifact">3. Upload Artifact</h3>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Upload</span> <span class="hljs-string">artifact</span>
  <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/upload-pages-artifact@v3</span>
  <span class="hljs-attr">with:</span>
    <span class="hljs-attr">path:</span> <span class="hljs-string">'.'</span>
</code></pre>
<p>This step packages the site content. It follows the principle of <strong>immutable artifacts</strong>:</p>
<ul>
<li>It creates a snapshot of the site at a specific point in time</li>
<li>It provides a consistent package that can be deployed</li>
<li>It enables potential rollbacks by preserving the artifact</li>
</ul>
<h3 id="heading-4-deploy">4. Deploy</h3>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Deploy</span> <span class="hljs-string">to</span> <span class="hljs-string">GitHub</span> <span class="hljs-string">Pages</span>
  <span class="hljs-attr">id:</span> <span class="hljs-string">deployment</span>
  <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/deploy-pages@v4</span>
</code></pre>
<p>The final step actually publishes the site. It embodies the principle of <strong>separation of concerns</strong>:</p>
<ul>
<li>Building the artifact and deploying it are separate operations</li>
<li>This allows for different permissions and controls at each stage</li>
<li>It supports a more secure deployment pipeline</li>
</ul>
<h2 id="heading-the-principle-of-least-privilege-in-action">The Principle of Least Privilege in Action</h2>
<p>One of the most important security principles is providing only the minimum permissions necessary. Our workflow demonstrates this with:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">permissions:</span>
  <span class="hljs-attr">contents:</span> <span class="hljs-string">read</span>
  <span class="hljs-attr">pages:</span> <span class="hljs-string">write</span>
  <span class="hljs-attr">id-token:</span> <span class="hljs-string">write</span>
</code></pre>
<p>This explicit permission model:</p>
<ul>
<li>Prevents the workflow from modifying repository content</li>
<li>Allows writing only to GitHub Pages</li>
<li>Provides identity token access for secure deployments</li>
</ul>
<p>Let's visualize how these permissions integrate with the workflow:</p>
<pre><code class="lang-mermaid">graph TD
    subgraph "Repository Boundary"
        A[Repository Content] --- B[Read-Only Access]
    end

    subgraph "GitHub Pages Boundary"
        C[GitHub Pages] --- D[Write Access]
    end

    subgraph "Authentication"
        E[OIDC Tokens] --- F[Write Access]
    end

    B --&gt; G[Checkout Step]
    D --&gt; H[Deploy Step]
    F --&gt; I[Authentication for Deployment]
</code></pre>
<h2 id="heading-environment-based-deployment">Environment-Based Deployment</h2>
<p>The workflow uses a specific environment for deployment:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">environment:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">github-pages</span>
  <span class="hljs-attr">url:</span> <span class="hljs-string">${{</span> <span class="hljs-string">steps.deployment.outputs.page_url</span> <span class="hljs-string">}}</span>
</code></pre>
<p>This follows the principle of <strong>environment segregation</strong>:</p>
<ul>
<li>Deployment targets are explicitly defined</li>
<li>Each environment can have its own protection rules</li>
<li>Deployment URLs are tracked and linked to the workflow</li>
</ul>
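<p>As a sketch of what environment segregation can look like beyond a single <code>github-pages</code> target, a repository could define separate staging and production environments, each with its own protection rules configured in the repository settings. The environment names and URL below are assumptions for illustration:</p>
<pre><code class="lang-yaml">jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging          # can require reviewers, wait timers, etc.
    steps:
      - run: echo "deploy to staging"

  deploy-production:
    needs: deploy-staging         # only runs after staging succeeds
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://example.com    # placeholder URL shown on the run summary
    steps:
      - run: echo "deploy to production"
</code></pre>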
<h2 id="heading-first-principles-of-static-site-deployment">First Principles of Static Site Deployment</h2>
<p>Our example deploys a simple static HTML site:</p>
<pre><code class="lang-html"><span class="hljs-meta">&lt;!DOCTYPE <span class="hljs-meta-keyword">html</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">html</span> <span class="hljs-attr">lang</span>=<span class="hljs-string">"en"</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">head</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">charset</span>=<span class="hljs-string">"UTF-8"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"viewport"</span> <span class="hljs-attr">content</span>=<span class="hljs-string">"width=device-width, initial-scale=1.0"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">title</span>&gt;</span>My GitHub Pages Site<span class="hljs-tag">&lt;/<span class="hljs-name">title</span>&gt;</span>
    <span class="hljs-comment">&lt;!-- CSS styling omitted for brevity --&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">head</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">body</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"container"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">h1</span>&gt;</span>Hello, GitHub Actions!<span class="hljs-tag">&lt;/<span class="hljs-name">h1</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>This site is deployed using GitHub Actions and GitHub Pages.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Last updated: February 4, 2025 at 10:10 AM IST<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">body</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">html</span>&gt;</span>
</code></pre>
<p>From first principles, deployment of static content requires:</p>
<ol>
<li><strong>Content Storage</strong>: A place to store the HTML, CSS, and JavaScript files</li>
<li><strong>Content Serving</strong>: A web server to deliver the files to browsers</li>
<li><strong>Content Delivery</strong>: A way to efficiently distribute the content</li>
</ol>
<p>GitHub Pages handles all three aspects:</p>
<ul>
<li>It stores the content in GitHub's infrastructure</li>
<li>It serves the content via GitHub's servers</li>
<li>It delivers the content via GitHub's CDN</li>
</ul>
<h2 id="heading-the-complete-deployment-flow">The Complete Deployment Flow</h2>
<p>Let's visualize the entire process from code push to website delivery:</p>
<pre><code class="lang-mermaid">graph TB
    A[Developer] --&gt;|1. Push to main| B[GitHub Repository]
    B --&gt;|2. Trigger workflow| C[GitHub Actions]

    subgraph "GitHub Actions Workflow"
        C --&gt;|3. Checkout code| D[Runner]
        D --&gt;|4. Setup Pages| E[Configure Environment]
        E --&gt;|5. Create artifact| F[Artifact]
        F --&gt;|6. Deploy| G[GitHub Pages Service]
    end

    G --&gt;|7. Publish| H[Live Website]
    H --&gt;|8. Serve content| I[End Users]

    classDef blue fill:#2b88d8,stroke:#000,stroke-width:1px,color:white;
    classDef green fill:#25a767,stroke:#000,stroke-width:1px,color:white;
    classDef orange fill:#ff9900,stroke:#000,stroke-width:1px,color:white;

    class A,B blue
    class C,D,E,F green
    class G,H,I orange
</code></pre>
<h2 id="heading-implementation-considerations">Implementation Considerations</h2>
<p>When implementing GitHub Actions workflows from first principles, consider:</p>
<h3 id="heading-1-workflow-isolation">1. Workflow Isolation</h3>
<p>Each workflow should have a single responsibility. For our static site, deployment is the sole responsibility. For more complex applications, you might have separate workflows for:</p>
<ul>
<li>Building and testing</li>
<li>Security scanning</li>
<li>Deployment to staging</li>
<li>Deployment to production</li>
</ul>
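<p>One way to keep such workflows isolated while still chaining them is the <code>workflow_run</code> trigger, which starts one workflow when another finishes. A sketch, with hypothetical workflow names:</p>
<pre><code class="lang-yaml"># .github/workflows/deploy-staging.yml (hypothetical file)
name: Deploy to Staging
on:
  workflow_run:
    workflows: ["Build and Test"]  # runs after this workflow finishes
    types: [completed]

jobs:
  deploy:
    # only deploy if the upstream workflow succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy to staging"
</code></pre>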
<h3 id="heading-2-artifact-immutability">2. Artifact Immutability</h3>
<p>Once created, artifacts should not be modified. This ensures consistency across environments and enables reliable rollbacks.</p>
<h3 id="heading-3-idempotent-deployments">3. Idempotent Deployments</h3>
<p>Deployments should be idempotent: running the same deployment multiple times should result in the same final state.</p>
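<p>GitHub Actions does not make a deployment idempotent by itself, but a related safeguard is to serialize deployments so two runs never interleave and race toward different final states. The <code>concurrency</code> key expresses this (the group name here is an assumption):</p>
<pre><code class="lang-yaml">concurrency:
  group: "pages-deploy"      # hypothetical group name shared by deploy runs
  cancel-in-progress: false  # queue new runs instead of overlapping them
</code></pre>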
<h3 id="heading-4-failure-handling">4. Failure Handling</h3>
<p>Workflows should fail fast and provide clear error messages. This reduces troubleshooting time and improves developer experience.</p>
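<p>A small pattern for surfacing failures is a step guarded by the <code>failure()</code> status function, which runs only when an earlier step in the job has failed:</p>
<pre><code class="lang-yaml">steps:
  - name: Deploy
    id: deployment
    uses: actions/deploy-pages@v4
  - name: Report failure
    if: failure()   # runs only if a previous step failed
    run: echo "Deployment of job ${{ github.job }} failed; check the step logs"
</code></pre>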
<h2 id="heading-extending-the-workflow">Extending the Workflow</h2>
<p>From first principles, we can extend our basic workflow to include additional steps:</p>
<h3 id="heading-1-testing">1. Testing</h3>
<p>Add automated tests to verify site functionality:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Test</span>
  <span class="hljs-attr">run:</span> <span class="hljs-string">|
    npm install -g htmlhint
    htmlhint index.html</span>
</code></pre>
<h3 id="heading-2-performance-optimization">2. Performance Optimization</h3>
<p>Optimize assets before deployment:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Optimize</span>
  <span class="hljs-attr">run:</span> <span class="hljs-string">|
    npm install -g html-minifier
    html-minifier --collapse-whitespace index.html -o index.html</span>
</code></pre>
<h3 id="heading-3-security-scanning">3. Security Scanning</h3>
<p>Add security checks to prevent vulnerable code from being deployed:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Security</span> <span class="hljs-string">scan</span>
  <span class="hljs-attr">uses:</span> <span class="hljs-string">aquasecurity/trivy-action@master</span>
  <span class="hljs-attr">with:</span>
    <span class="hljs-attr">scan-type:</span> <span class="hljs-string">'fs'</span>
    <span class="hljs-attr">format:</span> <span class="hljs-string">'table'</span>
</code></pre>
<h2 id="heading-conclusion-from-principles-to-practice">Conclusion: From Principles to Practice</h2>
<p>By understanding GitHub Actions from first principles, we gain insights beyond simply following tutorials:</p>
<ol>
<li><p><strong>We understand why each component exists</strong>: Each part of the workflow serves a specific purpose in the deployment pipeline.</p>
</li>
<li><p><strong>We can troubleshoot effectively</strong>: Knowledge of the underlying principles helps identify and fix issues when they arise.</p>
</li>
<li><p><strong>We can extend and customize</strong>: Instead of blindly copying examples, we can adapt workflows to our specific needs.</p>
</li>
<li><p><strong>We make better security decisions</strong>: Understanding the permission model allows us to implement the principle of least privilege.</p>
</li>
</ol>
<p>GitHub Actions workflows embody fundamental software engineering principles: separation of concerns, infrastructure as code, principle of least privilege, and immutable artifacts. By applying these principles to our deployments, we create robust, secure, and maintainable automation pipelines.</p>
<p>The next time you set up a GitHub Actions workflow, consider the principles behind each configuration option. This first-principles approach will lead to more thoughtful and effective automation strategies.</p>
<hr />
<p><em>Would you like to learn more about CI/CD principles or explore more advanced GitHub Actions workflows? Let me know in the comments below!</em></p>
]]></content:encoded></item><item><title><![CDATA[Building a Cloud-Native DevOps Pipeline from First Principles]]></title><description><![CDATA[Introduction
In today's rapidly evolving technological landscape, understanding how to build and deploy applications using cloud-native methodologies is essential for any software engineer. This blog post details the implementation of a complete DevO...]]></description><link>https://nikhilmishra.xyz/gitops-majorproject</link><guid isPermaLink="true">https://nikhilmishra.xyz/gitops-majorproject</guid><category><![CDATA[Devops]]></category><category><![CDATA[cloud native]]></category><category><![CDATA[cicd]]></category><category><![CDATA[gitops]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[AWS]]></category><category><![CDATA[GitHub Actions]]></category><category><![CDATA[#IaC]]></category><category><![CDATA[containerization]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 09 Mar 2025 12:12:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741522249065/762323d6-e320-4d67-b3fc-13c0e0ce78de.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In today's rapidly evolving technological landscape, understanding how to build and deploy applications using cloud-native methodologies is essential for any software engineer. This blog post details the implementation of a complete DevOps pipeline for the vProfile application, a multi-tier Java web application, using Infrastructure as Code (IaC), containerization, Kubernetes orchestration, and CI/CD practices.</p>
<p>This project applies modern DevOps practices from first principles, creating a robust, scalable, and automated deployment pipeline on AWS cloud infrastructure. We'll explore each component of the system, understand the underlying principles, and see how they work together to create a seamless deployment experience.</p>
<h2 id="heading-project-architecture-overview">Project Architecture Overview</h2>
<p>The vProfile project utilizes a microservices architecture deployed on AWS using Kubernetes. Before diving into the implementation details, let's understand the high-level architecture:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "CI/CD Pipeline"
        GH[GitHub Repositories]
        GA[GitHub Actions]
        TS[Test &amp; SonarQube]
        DB[Docker Build]
        ECR[Amazon ECR]
    end

    subgraph "AWS Infrastructure"
        TF[Terraform]
        VPC[AWS VPC]
        EKS[Amazon EKS]
        S3[S3 State Backend]
    end

    subgraph "Kubernetes Deployment"
        HC[Helm Charts]
        ING[NGINX Ingress]
        APP[Vprofile App]
        DB2[MySQL]
        MC[Memcached]
        RMQ[RabbitMQ]
    end

    GH --&gt; GA
    GA --&gt; TS
    TS --&gt; DB
    DB --&gt; ECR

    GH --&gt; TF
    TF --&gt; VPC
    TF --&gt; EKS
    TF --&gt; S3
    TF --&gt; ING

    ECR --&gt; HC
    HC --&gt; APP
    HC --&gt; DB2
    HC --&gt; MC
    HC --&gt; RMQ

    ING --&gt; APP
</code></pre>
<p>The project is divided into two main repositories:</p>
<ol>
<li><strong>iac-vprofile</strong>: Responsible for infrastructure provisioning using Terraform</li>
<li><strong>vprofile-action</strong>: Contains the application code and deployment configurations</li>
</ol>
<p>This separation of concerns ensures that infrastructure and application code can evolve independently while maintaining a cohesive deployment strategy.</p>
<h2 id="heading-first-principles-understanding-the-core-concepts">First Principles: Understanding the Core Concepts</h2>
<h3 id="heading-infrastructure-as-code-iac">Infrastructure as Code (IaC)</h3>
<p>At its core, Infrastructure as Code is about managing infrastructure through machine-readable definition files rather than manual processes. This approach offers several key benefits:</p>
<ol>
<li><strong>Reproducibility</strong>: Infrastructure can be consistently reproduced across different environments</li>
<li><strong>Version Control</strong>: Infrastructure changes can be tracked, reviewed, and rolled back</li>
<li><strong>Automation</strong>: Reduces manual errors and increases deployment speed</li>
<li><strong>Documentation</strong>: The code itself documents the infrastructure</li>
</ol>
<p>In our project, we use Terraform to define our AWS infrastructure, including VPC, subnets, and the EKS cluster.</p>
<h3 id="heading-containerization">Containerization</h3>
<p>Containers encapsulate an application and its dependencies into a self-contained unit that can run anywhere. The key principles include:</p>
<ol>
<li><strong>Isolation</strong>: Applications run in isolated environments</li>
<li><strong>Portability</strong>: Containers run consistently across different environments</li>
<li><strong>Efficiency</strong>: Lighter weight than virtual machines</li>
<li><strong>Scalability</strong>: Containers can be easily scaled horizontally</li>
</ol>
<p>We use Docker to containerize our vProfile application, creating a multi-stage build process that optimizes the final image size.</p>
<h3 id="heading-orchestration">Orchestration</h3>
<p>Container orchestration automates the deployment, scaling, and management of containerized applications. Core principles include:</p>
<ol>
<li><strong>Service Discovery</strong>: Containers can find and communicate with each other</li>
<li><strong>Load Balancing</strong>: Traffic is distributed across containers</li>
<li><strong>Self-healing</strong>: Failed containers are automatically replaced</li>
<li><strong>Scaling</strong>: Applications can scale up or down based on demand</li>
</ol>
<p>Amazon EKS (Elastic Kubernetes Service) serves as our orchestration platform, providing a managed Kubernetes environment.</p>
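<p>These principles surface directly in Kubernetes manifests. Below is a minimal, hypothetical Deployment and Service for the application tier; the image name, labels, and replica count are illustrative, and the real project templates these resources through Helm charts:</p>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: vproapp
spec:
  replicas: 3                    # scaling: three identical pods
  selector:
    matchLabels:
      app: vproapp
  template:
    metadata:
      labels:
        app: vproapp
    spec:
      containers:
        - name: vproapp
          image: example/vprofile-app:latest  # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: vproapp-service          # service discovery: stable in-cluster DNS name
spec:
  selector:
    app: vproapp                 # load balancing across matching pods
  ports:
    - port: 8080
      targetPort: 8080
</code></pre>
<p>Self-healing follows from the Deployment's replica count: if a pod dies, the controller replaces it to restore the declared state.</p>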
<h3 id="heading-continuous-integration-and-continuous-deployment-cicd">Continuous Integration and Continuous Deployment (CI/CD)</h3>
<p>CI/CD bridges the gap between development and operations by automating the building, testing, and deployment processes:</p>
<ol>
<li><strong>Continuous Integration</strong>: Code changes are regularly built and tested</li>
<li><strong>Continuous Delivery</strong>: Code is always in a deployable state</li>
<li><strong>Continuous Deployment</strong>: Code changes are automatically deployed to production</li>
<li><strong>Feedback Loops</strong>: Developers get quick feedback on changes</li>
</ol>
<p>GitHub Actions powers our CI/CD pipeline, automating everything from code testing to Kubernetes deployment.</p>
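<p>A condensed sketch of what such a pipeline job can look like, from checkout to pushing an image to Amazon ECR. The secrets, region, and repository name below are assumptions for illustration, not taken from the project's actual workflow:</p>
<pre><code class="lang-yaml">jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-2
      - name: Login to Amazon ECR
        id: ecr
        uses: aws-actions/amazon-ecr-login@v2
      - name: Build and push image
        run: |
          docker build -t ${{ steps.ecr.outputs.registry }}/vprofileapp:${{ github.sha }} .
          docker push ${{ steps.ecr.outputs.registry }}/vprofileapp:${{ github.sha }}
</code></pre>
<p>Tagging images with the commit SHA rather than <code>latest</code> keeps each build traceable back to the exact code that produced it.</p>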
<h2 id="heading-implementing-the-infrastructure-with-terraform">Implementing the Infrastructure with Terraform</h2>
<h3 id="heading-vpc-configuration">VPC Configuration</h3>
<p>The foundation of our AWS infrastructure is a well-architected Virtual Private Cloud (VPC):</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "AWS Region us-east-2"
        VPC["VPC: 172.20.0.0/16"]

        subgraph "Availability Zones"
            AZ1["AZ 1"]
            AZ2["AZ 2"]
            AZ3["AZ 3"]
        end

        subgraph "Private Subnets"
            PS1["172.20.1.0/24"]
            PS2["172.20.2.0/24"]
            PS3["172.20.3.0/24"]
        end

        subgraph "Public Subnets"
            PUS1["172.20.4.0/24"]
            PUS2["172.20.5.0/24"]
            PUS3["172.20.6.0/24"]
        end

        NG["NAT Gateway"]
        IGW["Internet Gateway"]
    end

    VPC --&gt; AZ1
    VPC --&gt; AZ2
    VPC --&gt; AZ3

    AZ1 --&gt; PS1
    AZ2 --&gt; PS2
    AZ3 --&gt; PS3

    AZ1 --&gt; PUS1
    AZ2 --&gt; PUS2
    AZ3 --&gt; PUS3

    PUS1 --&gt; IGW
    PUS2 --&gt; IGW
    PUS3 --&gt; IGW

    PS1 --&gt; NG
    PS2 --&gt; NG
    PS3 --&gt; NG

    NG --&gt; IGW
</code></pre>
<p>Our Terraform configuration creates a VPC with a CIDR block of 172.20.0.0/16, spanning three availability zones for high availability. It includes:</p>
<ul>
<li>Three private subnets for EKS worker nodes</li>
<li>Three public subnets for the load balancer</li>
<li>NAT gateway for outbound internet access from private subnets</li>
<li>Appropriate tags for Kubernetes integration</li>
</ul>
<p>Here's a key excerpt from our <code>vpc.tf</code>:</p>
<pre><code class="lang-hcl">module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.1.2"

  name = "vprofile-eks"
  cidr = "172.20.0.0/16"
  azs  = slice(data.aws_availability_zones.available.names, 0, 3)

  private_subnets = ["172.20.1.0/24", "172.20.2.0/24", "172.20.3.0/24"]
  public_subnets  = ["172.20.4.0/24", "172.20.5.0/24", "172.20.6.0/24"]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  public_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = 1
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = 1
  }
}
</code></pre>
<h3 id="heading-eks-cluster-configuration">EKS Cluster Configuration</h3>
<p>Amazon EKS provides a managed Kubernetes control plane, while our worker nodes run in the private subnets of our VPC:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Amazon EKS"
        CP["Control Plane"]

        subgraph "Node Group 1"
            NG1N1["t3.small"]
            NG1N2["t3.small"]
        end

        subgraph "Node Group 2"
            NG2N1["t3.small"]
        end
    end

    subgraph "VPC"
        PS["Private Subnets"]
    end

    subgraph "Autoscaling"
        ASG["ASG Config:
        Group 1: 1-3 nodes
        Group 2: 1-2 nodes"]
    end

    CP --&gt; NG1N1
    CP --&gt; NG1N2
    CP --&gt; NG2N1

    NG1N1 --&gt; PS
    NG1N2 --&gt; PS
    NG2N1 --&gt; PS

    ASG --&gt; NG1N1
    ASG --&gt; NG1N2
    ASG --&gt; NG2N1
</code></pre>
<p>Our EKS configuration creates a cluster with version 1.27 and two managed node groups running on t3.small instances:</p>
<pre><code class="lang-hcl">module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.19.1"

  cluster_name    = local.cluster_name
  cluster_version = "1.27"

  vpc_id                         = module.vpc.vpc_id
  subnet_ids                     = module.vpc.private_subnets
  cluster_endpoint_public_access = true

  eks_managed_node_group_defaults = {
    ami_type = "AL2_x86_64"
  }

  eks_managed_node_groups = {
    one = {
      name = "node-group-1"
      instance_types = ["t3.small"]
      min_size     = 1
      max_size     = 3
      desired_size = 2
    }

    two = {
      name = "node-group-2"
      instance_types = ["t3.small"]
      min_size     = 1
      max_size     = 2
      desired_size = 1
    }
  }
}
</code></pre>
<h3 id="heading-terraform-workflow-automation">Terraform Workflow Automation</h3>
<p>We use GitHub Actions to automate the Terraform workflow, ensuring consistent infrastructure deployments:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    actor Developer
    participant GitHub as GitHub Repository
    participant Actions as GitHub Actions
    participant S3 as S3 Backend
    participant AWS as AWS Services

    Developer-&gt;&gt;GitHub: Push code to main branch
    GitHub-&gt;&gt;Actions: Trigger workflow

    Actions-&gt;&gt;Actions: terraform init
    Actions-&gt;&gt;S3: Retrieve state
    S3-&gt;&gt;Actions: Return state

    Actions-&gt;&gt;Actions: terraform fmt check
    Actions-&gt;&gt;Actions: terraform validate
    Actions-&gt;&gt;Actions: terraform plan

    alt if main branch
        Actions-&gt;&gt;Actions: terraform apply
        Actions-&gt;&gt;AWS: Create/update resources
        AWS--&gt;&gt;Actions: Resources created/updated

        Actions-&gt;&gt;AWS: Configure kubectl
        Actions-&gt;&gt;AWS: Install NGINX Ingress
    end
</code></pre>
<p>The workflow includes:</p>
<ol>
<li>Terraform initialization with S3 backend</li>
<li>Format checking and validation</li>
<li>Planning the infrastructure changes</li>
<li>Applying changes only on the main branch</li>
<li>Configuring kubectl and installing the NGINX ingress controller</li>
</ol>
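<p>Run by hand, the same sequence the pipeline automates looks roughly like the following (the state bucket name is a placeholder, and the ingress-nginx manifest URL is illustrative; pin a release tag in practice):</p>
<pre><code class="lang-bash"># Initialize against the remote S3 state backend
terraform init -backend-config="bucket=YOUR_STATE_BUCKET" -backend-config="key=terraform.tfstate"

# Static checks before any changes are planned
terraform fmt -check
terraform validate

# Preview, then apply (CI applies only on the main branch)
terraform plan -out=planfile
terraform apply -auto-approve planfile

# Point kubectl at the new cluster and install the NGINX ingress controller
aws eks update-kubeconfig --region us-east-2 --name vprofile-eks
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/aws/deploy.yaml
</code></pre>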
<h2 id="heading-application-architecture-and-containerization">Application Architecture and Containerization</h2>
<h3 id="heading-vprofile-application-components">vProfile Application Components</h3>
<p>The vProfile application consists of multiple microservices:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "vProfile Application"
        WA["Web Application (Tomcat)"]
        DB["MySQL Database"]
        MC["Memcached"]
        RMQ["RabbitMQ"]
    end

    User["User"] --&gt; WA
    WA --&gt; DB
    WA --&gt; MC
    WA --&gt; RMQ
</code></pre>
<p>Each component is containerized and deployed as a separate service in Kubernetes.</p>
<h3 id="heading-multi-stage-docker-build">Multi-stage Docker Build</h3>
<p>We use a multi-stage Docker build to optimize our application container:</p>
<pre><code class="lang-mermaid">flowchart LR
    subgraph "Build Stage"
        JDK["OpenJDK 11"]
        MVN["Maven Build"]
        WAR["vprofile-v2.war"]
    end

    subgraph "Final Stage"
        TC["Tomcat 9"]
        DEPLOY["Deploy WAR"]
    end

    JDK --&gt; MVN
    MVN --&gt; WAR
    WAR --&gt; DEPLOY
    TC --&gt; DEPLOY
</code></pre>
<p>The Dockerfile efficiently builds the application and creates a lean production image:</p>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">FROM</span> openjdk:<span class="hljs-number">11</span> AS BUILD_IMAGE
<span class="hljs-keyword">RUN</span><span class="bash"> apt update &amp;&amp; apt install maven -y</span>
<span class="hljs-keyword">COPY</span><span class="bash"> ./ vprofile-project</span>
<span class="hljs-keyword">RUN</span><span class="bash"> <span class="hljs-built_in">cd</span> vprofile-project &amp;&amp;  mvn install </span>

<span class="hljs-keyword">FROM</span> tomcat:<span class="hljs-number">9</span>-jre11
<span class="hljs-keyword">LABEL</span><span class="bash"> <span class="hljs-string">"Project"</span>=<span class="hljs-string">"Vprofile"</span></span>
<span class="hljs-keyword">LABEL</span><span class="bash"> <span class="hljs-string">"Author"</span>=<span class="hljs-string">"Imran"</span></span>
<span class="hljs-keyword">RUN</span><span class="bash"> rm -rf /usr/<span class="hljs-built_in">local</span>/tomcat/webapps/*</span>
<span class="hljs-keyword">COPY</span><span class="bash"> --from=BUILD_IMAGE vprofile-project/target/vprofile-v2.war /usr/<span class="hljs-built_in">local</span>/tomcat/webapps/ROOT.war</span>

<span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">8080</span>
<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"catalina.sh"</span>, <span class="hljs-string">"run"</span>]</span>
</code></pre>
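<p>The image can be built and smoke-tested locally before the pipeline ever runs; the tag below is illustrative:</p>
<pre><code class="lang-bash"># Build the multi-stage image; only the Tomcat stage ends up in the final tag
docker build -t vprofileapp:local .

# Run it and hit the port the Dockerfile exposes
docker run -d -p 8080:8080 --name vproapp vprofileapp:local
curl -I http://localhost:8080/
</code></pre>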
<h2 id="heading-kubernetes-deployment-with-helm">Kubernetes Deployment with Helm</h2>
<h3 id="heading-helm-charts-structure">Helm Charts Structure</h3>
<p>Helm is used to template and parameterize our Kubernetes manifests:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Helm Chart Structure"
        CH["vprofilecharts/"]
        VAL["values.yaml"]
        TPL["templates/"]

        subgraph "Templates"
            APP["vproappdep.yml"]
            SVC["Service definitions"]
            ING["vproingress.yaml"]
            DB["Database templates"]
            MC["Memcached templates"]
            RMQ["RabbitMQ templates"]
        end
    end

    CH --&gt; VAL
    CH --&gt; TPL
    TPL --&gt; APP
    TPL --&gt; SVC
    TPL --&gt; ING
    TPL --&gt; DB
    TPL --&gt; MC
    TPL --&gt; RMQ
</code></pre>
<h3 id="heading-application-deployment">Application Deployment</h3>
<p>The application deployment uses init containers to block startup until its backing services (the database and cache) are resolvable in cluster DNS:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">vproapp</span>
  <span class="hljs-attr">labels:</span> 
    <span class="hljs-attr">app:</span> <span class="hljs-string">vproapp</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">vproapp</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">vproapp</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">vproapp</span>
        <span class="hljs-attr">image:</span> {{ <span class="hljs-string">.Values.appimage</span>}}<span class="hljs-string">:{{</span> <span class="hljs-string">.Values.apptag}}</span>
        <span class="hljs-attr">ports:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">vproapp-port</span>
          <span class="hljs-attr">containerPort:</span> <span class="hljs-number">8080</span>
      <span class="hljs-attr">initContainers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">init-mydb</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">busybox</span>
        <span class="hljs-attr">command:</span> [<span class="hljs-string">'sh'</span>, <span class="hljs-string">'-c'</span>, <span class="hljs-string">'until nslookup vprodb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done;'</span>]
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">init-memcache</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">busybox</span>
        <span class="hljs-attr">command:</span> [<span class="hljs-string">'sh'</span>, <span class="hljs-string">'-c'</span>, <span class="hljs-string">'until nslookup vprocache01.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done;'</span>]
</code></pre>
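<p>The <code>appimage</code> and <code>apptag</code> placeholders in the template above are resolved at install time. A hedged local equivalent of what the pipeline runs (the registry URL and tag value are placeholders; CI uses the ECR registry and <code>github.run_number</code>):</p>
<pre><code class="lang-bash">helm upgrade --install vprofile-stack helm/vprofilecharts \
  --namespace default \
  --set appimage=ACCOUNT_ID.dkr.ecr.us-east-2.amazonaws.com/vprofileapp \
  --set apptag=42
</code></pre>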
<h3 id="heading-ingress-configuration">Ingress Configuration</h3>
<p>The NGINX ingress controller routes external traffic to our application:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Ingress</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">vpro-ingress</span>
  <span class="hljs-attr">annotations:</span>
    <span class="hljs-attr">nginx.ingress.kubernetes.io/use-regex:</span> <span class="hljs-string">"true"</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">ingressClassName:</span> <span class="hljs-string">nginx</span>
  <span class="hljs-attr">rules:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">host:</span> <span class="hljs-string">majorproject.nikhilmishra.live</span>
    <span class="hljs-attr">http:</span>
      <span class="hljs-attr">paths:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">path:</span> <span class="hljs-string">/</span>
        <span class="hljs-attr">pathType:</span> <span class="hljs-string">Prefix</span>
        <span class="hljs-attr">backend:</span>
          <span class="hljs-attr">service:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">my-app</span>
            <span class="hljs-attr">port:</span>
              <span class="hljs-attr">number:</span> <span class="hljs-number">8080</span>
</code></pre>
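<p>Once the ingress is applied and DNS for the host points at the controller's load balancer, routing can be checked without waiting for DNS propagation by sending the Host header directly (the load balancer address below is illustrative):</p>
<pre><code class="lang-bash"># Confirm the ingress has been assigned an address
kubectl get ingress vpro-ingress

# Exercise the host-based rule against the load balancer directly
curl -H "Host: majorproject.nikhilmishra.live" http://ELB_DNS_NAME/
</code></pre>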
<h2 id="heading-cicd-pipeline-with-github-actions">CI/CD Pipeline with GitHub Actions</h2>
<h3 id="heading-application-cicd-workflow">Application CI/CD Workflow</h3>
<p>Our GitHub Actions workflow for the application pipeline includes testing, building, and deploying:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    actor Developer
    participant GitHub as GitHub
    participant GHCI as GitHub Actions
    participant Sonar as SonarQube
    participant Docker as Docker Build
    participant ECR as Amazon ECR
    participant EKS as Amazon EKS

    Developer-&gt;&gt;GitHub: Push code
    GitHub-&gt;&gt;GHCI: Trigger workflow

    GHCI-&gt;&gt;GHCI: Maven test
    GHCI-&gt;&gt;GHCI: Checkstyle
    GHCI-&gt;&gt;Sonar: SonarQube scan

    GHCI-&gt;&gt;Docker: Build image
    Docker-&gt;&gt;ECR: Push image

    GHCI-&gt;&gt;EKS: Configure kubectl
    GHCI-&gt;&gt;EKS: Create Docker registry secret
    GHCI-&gt;&gt;EKS: Deploy with Helm
</code></pre>
<p>The workflow includes:</p>
<ol>
<li><p><strong>Testing Phase</strong>:</p>
<ul>
<li>Maven tests</li>
<li>Code style checks</li>
<li>SonarQube analysis for code quality</li>
</ul>
</li>
<li><p><strong>Build and Publish Phase</strong>:</p>
<ul>
<li>Docker image building</li>
<li>Push to Amazon ECR</li>
</ul>
</li>
<li><p><strong>Deployment Phase</strong>:</p>
<ul>
<li>Configure kubectl</li>
<li>Create registry credentials</li>
<li>Deploy using Helm</li>
</ul>
</li>
</ol>
<p>Here's a key excerpt from the workflow file:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">vprofile</span> <span class="hljs-string">actions</span>
<span class="hljs-attr">on:</span> <span class="hljs-string">workflow_dispatch</span>
<span class="hljs-attr">env:</span>
  <span class="hljs-attr">AWS_REGION:</span> <span class="hljs-string">us-east-2</span>
  <span class="hljs-attr">ECR_REPOSITORY:</span> <span class="hljs-string">vprofileapp</span>
  <span class="hljs-attr">EKS_CLUSTER:</span> <span class="hljs-string">vprofile-eks</span>

<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">Testing:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Code</span> <span class="hljs-string">checkout</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Maven</span> <span class="hljs-string">test</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">mvn</span> <span class="hljs-string">test</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Checkstyle</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">mvn</span> <span class="hljs-string">checkstyle:checkstyle</span>

      <span class="hljs-comment"># More testing steps...</span>

  <span class="hljs-attr">BUILD_AND_PUBLISH:</span>   
    <span class="hljs-attr">needs:</span> <span class="hljs-string">Testing</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Code</span> <span class="hljs-string">checkout</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Build</span> <span class="hljs-string">&amp;</span> <span class="hljs-string">Upload</span> <span class="hljs-string">image</span> <span class="hljs-string">to</span> <span class="hljs-string">ECR</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">appleboy/docker-ecr-action@master</span>
        <span class="hljs-attr">with:</span>
         <span class="hljs-attr">access_key:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.AWS_ACCESS_KEY_ID</span> <span class="hljs-string">}}</span>
         <span class="hljs-attr">secret_key:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.AWS_SECRET_ACCESS_KEY</span> <span class="hljs-string">}}</span>
         <span class="hljs-attr">registry:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.REGISTRY</span> <span class="hljs-string">}}</span>
         <span class="hljs-attr">repo:</span> <span class="hljs-string">${{</span> <span class="hljs-string">env.ECR_REPOSITORY</span> <span class="hljs-string">}}</span>
         <span class="hljs-attr">region:</span> <span class="hljs-string">${{</span> <span class="hljs-string">env.AWS_REGION</span> <span class="hljs-string">}}</span>
         <span class="hljs-attr">tags:</span> <span class="hljs-string">latest,${{</span> <span class="hljs-string">github.run_number</span> <span class="hljs-string">}}</span>
         <span class="hljs-attr">daemon_off:</span> <span class="hljs-literal">false</span>
         <span class="hljs-attr">dockerfile:</span> <span class="hljs-string">./Dockerfile</span>
         <span class="hljs-attr">context:</span> <span class="hljs-string">./</span>

  <span class="hljs-attr">DeployToEKS:</span>
    <span class="hljs-attr">needs:</span> <span class="hljs-string">BUILD_AND_PUBLISH</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-comment"># Deployment steps...</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Deploy</span> <span class="hljs-string">Helm</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">bitovi/github-actions-deploy-eks-helm@v1.2.8</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">aws-access-key-id:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.AWS_ACCESS_KEY_ID</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">aws-secret-access-key:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.AWS_SECRET_ACCESS_KEY</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">aws-region:</span> <span class="hljs-string">${{</span> <span class="hljs-string">env.AWS_REGION</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">cluster-name:</span> <span class="hljs-string">${{</span> <span class="hljs-string">env.EKS_CLUSTER</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">chart-path:</span> <span class="hljs-string">helm/vprofilecharts</span>
          <span class="hljs-attr">namespace:</span> <span class="hljs-string">default</span>
          <span class="hljs-attr">values:</span> <span class="hljs-string">appimage=${{</span> <span class="hljs-string">secrets.REGISTRY</span> <span class="hljs-string">}}/${{</span> <span class="hljs-string">env.ECR_REPOSITORY</span> <span class="hljs-string">}},apptag=${{</span> <span class="hljs-string">github.run_number</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">name:</span> <span class="hljs-string">vprofile-stack</span>
</code></pre>
<h2 id="heading-the-complete-system-integration-and-flow">The Complete System: Integration and Flow</h2>
<p>Now that we've examined each component individually, let's see how they work together in a complete CI/CD pipeline:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Developer Workflow"
        IC["Infrastructure Code Changes"]
        AC["Application Code Changes"]

        subgraph "GitHub Repositories"
            IRep["iac-vprofile"]
            ARep["vprofile-action"]
        end
    end

    subgraph "Infrastructure Pipeline"
        IGH["GitHub Actions"]
        TInit["Terraform Init"]
        TPlan["Terraform Plan"]
        TApply["Terraform Apply"]
        KConfig["kubectl Config"]
        NGinst["NGINX Ingress Install"]
    end

    subgraph "Application Pipeline"
        AGH["GitHub Actions"]
        Test["Maven Tests"]
        CS["Checkstyle"]
        SQ["SonarQube Analysis"]
        DocBuild["Docker Build"]
        DocPush["Push to ECR"]
        HDepl["Helm Deployment"]
    end

    subgraph "AWS Infrastructure"
        VPC["AWS VPC"]
        EKS["Amazon EKS"]
        ECR["Amazon ECR"]
        S3["S3 State Bucket"]
    end

    subgraph "Kubernetes Resources"
        ING["Ingress"]
        APP["vProfile App"]
        DB["MySQL"]
        MC["Memcached"]
        RMQ["RabbitMQ"]
    end

    subgraph "End Users"
        User["Users"]
    end

    IC --&gt; IRep
    AC --&gt; ARep

    IRep --&gt; IGH
    IGH --&gt; TInit
    TInit --&gt; TPlan
    TPlan --&gt; TApply
    TApply --&gt; KConfig
    KConfig --&gt; NGinst

    ARep --&gt; AGH
    AGH --&gt; Test
    Test --&gt; CS
    CS --&gt; SQ
    SQ --&gt; DocBuild
    DocBuild --&gt; DocPush
    DocPush --&gt; HDepl

    TApply --&gt; VPC
    TApply --&gt; EKS
    TApply --&gt; S3

    DocPush --&gt; ECR

    HDepl --&gt; ING
    HDepl --&gt; APP
    HDepl --&gt; DB
    HDepl --&gt; MC
    HDepl --&gt; RMQ

    NGinst --&gt; ING
    ING --&gt; APP

    User --&gt; ING
</code></pre>
<p>The workflow proceeds as follows:</p>
<ol>
<li>Infrastructure changes trigger the Terraform workflow to create or update AWS resources</li>
<li>Application changes trigger the application workflow for testing, building, and deployment</li>
<li>The application is deployed to the EKS cluster created by the infrastructure pipeline</li>
<li>Users access the application through the NGINX ingress controller</li>
</ol>
<h2 id="heading-security-and-best-practices">Security and Best Practices</h2>
<p>Throughout this project, we've implemented several security best practices:</p>
<ol>
<li><strong>Least Privilege</strong>: Using IAM roles with minimal permissions</li>
<li><strong>Infrastructure Segregation</strong>: Separating public and private subnets</li>
<li><strong>Secrets Management</strong>: Storing sensitive information in GitHub Secrets</li>
<li><strong>Image Security</strong>: Using multi-stage builds to minimize attack surface</li>
<li><strong>Code Quality</strong>: Implementing automated testing and code analysis</li>
</ol>
<h2 id="heading-challenges-and-learning-outcomes">Challenges and Learning Outcomes</h2>
<p>Building this project presented several interesting challenges:</p>
<ol>
<li><strong>Terraform State Management</strong>: Learning to manage state files securely using S3 backends</li>
<li><strong>Kubernetes Networking</strong>: Understanding the intricacies of Kubernetes ingress and service discovery</li>
<li><strong>CI/CD Integration</strong>: Connecting multiple pipelines with appropriate dependencies</li>
<li><strong>Container Optimization</strong>: Creating efficient Docker images using multi-stage builds</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>The vProfile project demonstrates a comprehensive implementation of modern DevOps principles and practices. By leveraging Infrastructure as Code, containerization, Kubernetes orchestration, and CI/CD pipelines, we've created a robust, scalable, and easily maintainable deployment pipeline.</p>
<p>This approach offers several key benefits:</p>
<ol>
<li><strong>Speed</strong>: Automated deployments reduce time-to-market</li>
<li><strong>Consistency</strong>: Infrastructure and application deployments are reproducible</li>
<li><strong>Scalability</strong>: Kubernetes allows for easy scaling of application components</li>
<li><strong>Maintainability</strong>: Code-based infrastructure and pipelines simplify maintenance</li>
<li><strong>Resilience</strong>: Multi-AZ deployment ensures high availability</li>
</ol>
<p>The knowledge and skills gained from building this project provide a solid foundation for implementing similar architectures in other enterprise contexts. Understanding these DevOps principles from first principles enables you to adapt these patterns to various cloud platforms and application architectures.</p>
<h2 id="heading-future-enhancements">Future Enhancements</h2>
<p>While the current implementation is robust, several enhancements could further improve the system:</p>
<ol>
<li><strong>Multiple Environments</strong>: Extend the infrastructure to support development, staging, and production</li>
<li><strong>Advanced Monitoring</strong>: Implement comprehensive monitoring with Prometheus and Grafana</li>
<li><strong>Service Mesh</strong>: Add Istio or Linkerd for advanced traffic management and security</li>
<li><strong>GitOps</strong>: Implement ArgoCD or Flux for GitOps-based continuous deployment</li>
<li><strong>Automated Testing</strong>: Add more comprehensive integration and end-to-end tests</li>
</ol>
<p>By continuing to evolve this architecture, we can create an even more powerful and flexible DevOps platform.</p>
<hr />
<p><em>This project was developed as a major project for college by Nikhil Mishra. The source code is available in the <a target="_blank" href="https://github.com/kaalpanikh/iac-vprofile">iac-vprofile</a> and <a target="_blank" href="https://github.com/kaalpanikh/vprofile-action">vprofile-action</a> repositories.</em></p>
]]></content:encoded></item><item><title><![CDATA[AushadhiAI: AI-Powered Prescription Analysis System]]></title><description><![CDATA[Introduction
AushadhiAI is an innovative solution that leverages artificial intelligence and Azure Computer Vision to decode doctors' handwritten prescriptions, making medication information accessible and understandable for patients. This technical ...]]></description><link>https://nikhilmishra.xyz/aushadhiai</link><guid isPermaLink="true">https://nikhilmishra.xyz/aushadhiai</guid><category><![CDATA[Azure]]></category><category><![CDATA[AI]]></category><category><![CDATA[Computer Vision]]></category><category><![CDATA[healthcare]]></category><category><![CDATA[OCR ]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[Cloud]]></category><category><![CDATA[Cloud Computing]]></category><category><![CDATA[AWS]]></category><category><![CDATA[Devops]]></category><category><![CDATA[ML]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sat, 08 Mar 2025 04:04:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741406449779/41091c67-bc20-42be-ad85-45178e6bd1a0.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>AushadhiAI is an innovative solution that leverages artificial intelligence and Azure Computer Vision to decode doctors' handwritten prescriptions, making medication information accessible and understandable for patients. This technical deep dive explores the architecture, implementation details, and key features of the AushadhiAI system.</p>
<h2 id="heading-system-architecture">System Architecture</h2>
<p>AushadhiAI employs a modern, scalable architecture that separates frontend and backend concerns while leveraging cloud services for advanced AI capabilities.</p>
<pre><code class="lang-mermaid">flowchart TD
    User[User] --&gt;|Uploads Prescription| Frontend[Frontend UI]
    Frontend --&gt;|HTTP Request| Backend[Backend API]
    Backend --&gt;|Image Analysis| AzureCV[Azure Computer Vision]
    Backend --&gt;|Medication Lookup| MedDB[(Medication Database)]
    AzureCV --&gt;|OCR Results| Backend
    Backend --&gt;|JSON Response| Frontend
    Frontend --&gt;|Display Results| User
</code></pre>
<h3 id="heading-key-components">Key Components</h3>
<ol>
<li><strong>Frontend</strong>: HTML, CSS, and JavaScript providing a responsive user interface</li>
<li><strong>Backend API</strong>: FastAPI application providing RESTful endpoints for image analysis  </li>
<li><strong>Azure Vision Service</strong>: Cloud-based OCR through Azure Computer Vision API</li>
<li><strong>Medication Service</strong>: Logic for identifying medications from extracted text</li>
<li><strong>Medication Database</strong>: JSON-based storage of medication information</li>
</ol>
<h2 id="heading-technical-implementation-details">Technical Implementation Details</h2>
<h3 id="heading-backend-system">Backend System</h3>
<p>The backend is built with FastAPI, a modern, high-performance web framework for building APIs with Python. It provides several key endpoints:</p>
<pre><code class="lang-mermaid">classDiagram
    class FastAPIApp {
        +read_root()
        +health_check()
        +get_medications()
        +get_medication_details(name)
        +analyze_prescription(file)
        +get_sample_medications()
        +check_azure()
    }

    class OCRService {
        +extract_text(image_bytes)
    }

    class MedicationService {
        +get_all_medication_names()
        +get_medication_details(name)
    }

    class AzureVisionService {
        +is_available
        +extract_text(image_bytes)
        -_fallback_extract_text(image_bytes)
    }

    class RxNormService {
        +lookup_medication(name)
    }

    FastAPIApp --&gt; OCRService
    FastAPIApp --&gt; MedicationService
    FastAPIApp --&gt; AzureVisionService
    FastAPIApp --&gt; RxNormService
</code></pre>
<h4 id="heading-key-backend-components">Key Backend Components:</h4>
<ol>
<li><strong>app.py</strong>: Main FastAPI application that defines all endpoints</li>
<li><strong>services/azure_vision_service.py</strong>: Handles communication with Azure Computer Vision API</li>
<li><strong>services/ocr_service.py</strong>: Manages text extraction from images with fallback mechanisms</li>
<li><strong>services/med_service.py</strong>: Identifies medications from extracted text</li>
<li><strong>services/rxnorm_service.py</strong>: Integrates with RxNorm for standardized medication information</li>
</ol>
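<p>The medication-matching step in <code>med_service.py</code> reduces to a simple idea: scan the OCR output for known medication names, tolerating case differences. A minimal, self-contained sketch of that logic (the function name and the inline "database" are illustrative, not the project's actual API or JSON schema):</p>
<pre><code class="lang-python"># Minimal sketch of medication matching over OCR text.
# MED_DB stands in for the project's JSON medication database;
# the entries and fields here are illustrative.
MED_DB = {
    "paracetamol": {"dosage": "500 mg", "use": "pain/fever"},
    "amoxicillin": {"dosage": "250 mg", "use": "antibiotic"},
}

def find_medications(ocr_text: str) -> list[dict]:
    """Return details for every known medication mentioned in the text."""
    text = ocr_text.lower()
    matches = []
    for name, details in MED_DB.items():
        if name in text:
            matches.append({"name": name, **details})
    return matches

results = find_medications("Rx: Paracetamol 500mg twice daily")
print([m["name"] for m in results])
</code></pre>
<p>A real implementation would add fuzzy matching to tolerate OCR errors in handwritten text, but the case-insensitive containment check captures the core lookup.</p>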
<h3 id="heading-prescription-analysis-process">Prescription Analysis Process</h3>
<p>The prescription analysis workflow involves several steps:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant AzureVision
    participant MedService

    User-&gt;&gt;Frontend: Upload prescription image
    Frontend-&gt;&gt;Backend: POST /api/analyze
    Backend-&gt;&gt;AzureVision: extract_text(image)
    alt Azure available
        AzureVision--&gt;&gt;Backend: OCR text results
    else Azure unavailable
        AzureVision--&gt;&gt;Backend: Use fallback OCR
    end
    Backend-&gt;&gt;MedService: Find medications in text
    MedService--&gt;&gt;Backend: Medication matches
    Backend--&gt;&gt;Frontend: Analysis results (JSON)
    Frontend--&gt;&gt;User: Display medication information
</code></pre>
<h3 id="heading-frontend-implementation">Frontend Implementation</h3>
<p>The frontend provides an intuitive interface for users to upload and analyze prescriptions:</p>
<pre><code class="lang-mermaid">flowchart LR
    subgraph Frontend
        UI[User Interface] --&gt; Upload
        Upload --&gt; Analysis
        Analysis --&gt; Results
    end

    subgraph Components
        Upload[Upload Component]
        Analysis[Analysis Process]
        Results[Results Display]
    end

    UI --&gt;|User Interaction| Components
</code></pre>
<p>The interface includes:</p>
<ol>
<li><strong>Upload Section</strong>: For prescription image upload</li>
<li><strong>Processing Visualization</strong>: Shows analysis progress</li>
<li><strong>Results Display</strong>: Presents identified medications and details</li>
<li><strong>Responsive Design</strong>: Works across desktop and mobile devices</li>
</ol>
<h2 id="heading-deployment-architecture">Deployment Architecture</h2>
<p>AushadhiAI is deployed using a modern cloud-based infrastructure:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Frontend Deployment"
        GitHubPages[GitHub Pages]
    end

    subgraph "CI/CD Pipeline"
        GitHubActions[GitHub Actions]
    end

    subgraph "Backend Services"
        ElasticBeanstalk[AWS Elastic Beanstalk]
        ECR[Amazon ECR]
        CloudWatch[AWS CloudWatch]
    end

    subgraph "External Services"
        Azure[Azure Computer Vision]
    end

    GitHubActions --&gt;|Deploy Frontend| GitHubPages
    GitHubActions --&gt;|Deploy Backend| ECR
    ECR --&gt;|Container Image| ElasticBeanstalk
    ElasticBeanstalk --&gt;|Monitoring| CloudWatch
    ElasticBeanstalk --&gt;|API Calls| Azure
    GitHubPages --&gt;|API Requests| ElasticBeanstalk
</code></pre>
<h3 id="heading-deployment-components">Deployment Components:</h3>
<ol>
<li><strong>Frontend</strong>: Hosted on GitHub Pages (static hosting)</li>
<li><strong>Backend</strong>: Containerized with Docker and deployed on AWS Elastic Beanstalk</li>
<li><strong>CI/CD</strong>: Automated deployment using GitHub Actions</li>
<li><strong>Monitoring</strong>: AWS CloudWatch for performance and error tracking</li>
</ol>
<h2 id="heading-system-features">System Features</h2>
<h3 id="heading-1-robust-ocr-capabilities">1. Robust OCR Capabilities</h3>
<p>The system uses Azure Computer Vision API for high-quality OCR, with a fallback mechanism for offline operation:</p>
<pre><code class="lang-mermaid">flowchart TD
    Start[Receive Image] --&gt;|Process| AzureCheck{Azure Available?}
    AzureCheck --&gt;|Yes| AzureOCR[Use Azure Vision API]
    AzureCheck --&gt;|No| LocalOCR[Use Local OCR Fallback]
    AzureOCR --&gt; TextExtraction[Extract Text]
    LocalOCR --&gt; TextExtraction
    TextExtraction --&gt; MedicationIdentification[Identify Medications]
</code></pre>
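<p>The availability check and fallback branch above can be sketched in a few lines. This is an illustrative skeleton: the cloud call and the local OCR are stubbed with placeholder strings so the control flow is the focus, and the real service would check connectivity, not just configuration:</p>

```python
from typing import Optional

class AzureVisionService:
    """Sketch of the Azure-or-fallback decision shown in the flowchart."""

    def __init__(self, api_key: Optional[str]):
        self.api_key = api_key

    @property
    def is_available(self) -> bool:
        # The real service would also verify connectivity, not just config
        return self.api_key is not None

    def extract_text(self, image_bytes: bytes) -> str:
        if self.is_available:
            return self._azure_extract_text(image_bytes)
        return self._fallback_extract_text(image_bytes)

    def _azure_extract_text(self, image_bytes: bytes) -> str:
        # Placeholder for the Azure Computer Vision Read API round-trip
        return "azure-ocr-result"

    def _fallback_extract_text(self, image_bytes: bytes) -> str:
        # Local best-effort OCR used when the cloud service is unreachable
        return "fallback-ocr-result"
```

<p>Because callers only ever invoke <code>extract_text</code>, the degraded path is invisible to the rest of the pipeline, which is what lets the system keep working offline.</p>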
<h3 id="heading-2-medication-identification">2. Medication Identification</h3>
<p>The system identifies medications using a combination of techniques:</p>
<pre><code class="lang-mermaid">flowchart LR
    OCRText[OCR Text] --&gt; Preprocessing[Text Preprocessing]
    Preprocessing --&gt; NameMatching[Medication Name Matching]
    NameMatching --&gt; Validation[Validation]
    Validation --&gt; DosageExtraction[Dosage Information Extraction]
    DosageExtraction --&gt; Results[Medication Results]
</code></pre>
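<p>A toy version of the matching stage, assuming a known-medication list and a simple substring-plus-regex pass. The production matcher likely normalises text more aggressively and scores fuzzy matches against the RxNorm data; the medication names here are sample data:</p>

```python
import re

# Sample data; the real list comes from the medication database
KNOWN_MEDICATIONS = {"amoxicillin", "paracetamol", "ibuprofen"}

def find_medications(ocr_text: str):
    """Return (name, dosage) pairs found in free-form OCR text."""
    results = []
    for line in ocr_text.lower().splitlines():
        for med in KNOWN_MEDICATIONS:
            if med in line:
                # Grab an adjacent dosage like "500 mg" if one is present
                m = re.search(r"(\d+\s*(?:mg|ml|mcg))", line)
                results.append((med, m.group(1) if m else None))
    return results
```

<p>Even this naive pass illustrates why preprocessing matters: lowercasing and line splitting happen before any matching, so OCR case noise never reaches the dosage extractor.</p>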
<h3 id="heading-3-error-handling-and-resilience">3. Error Handling and Resilience</h3>
<p>The system is designed with robust error handling:</p>
<pre><code class="lang-mermaid">flowchart TD
    Request[API Request] --&gt; Validation{Input Valid?}
    Validation --&gt;|Yes| Processing[Process Request]
    Validation --&gt;|No| Error400[Return 400 Error]
    Processing --&gt; ServiceCheck{Services Available?}
    ServiceCheck --&gt;|Yes| SuccessfulResponse[Return Response]
    ServiceCheck --&gt;|No| FallbackMechanism[Use Fallback]
    FallbackMechanism --&gt; LimitedResponse[Return Limited Response]
</code></pre>
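<p>The validate-then-degrade path in the flowchart condenses to a small decision function. Status codes and field names below are illustrative, not the API's actual response schema:</p>

```python
ALLOWED_TYPES = {"image/jpeg", "image/png"}

def handle_analyze_request(content_type: str, image_bytes: bytes,
                           azure_available: bool) -> dict:
    # Input validation: reject bad requests before doing any work
    if content_type not in ALLOWED_TYPES or not image_bytes:
        return {"status": 400, "error": "unsupported or empty upload"}
    # Service check: degrade gracefully instead of failing outright
    if not azure_available:
        return {"status": 200, "source": "fallback",
                "note": "limited results: cloud OCR unavailable"}
    return {"status": 200, "source": "azure"}
```

<p>Ordering matters here: validation runs before the (cheaper to skip) service check, so invalid uploads never consume OCR capacity.</p>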
<h2 id="heading-performance-considerations">Performance Considerations</h2>
<pre><code class="lang-mermaid">graph LR
    A[Image Upload] --&gt; B[Image Preprocessing]
    B --&gt; C[OCR Processing]
    C --&gt; D[Medication Identification]
    D --&gt; E[Response Generation]

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:4px
    style D fill:#bbf,stroke:#333,stroke-width:4px
</code></pre>
<p>The most computationally intensive parts of the system are:</p>
<ol>
<li><strong>OCR Processing</strong>: Handled by Azure Computer Vision to offload processing</li>
<li><strong>Medication Identification</strong>: Optimized with efficient text matching algorithms</li>
<li><strong>Image Preprocessing</strong>: Used to enhance OCR accuracy</li>
</ol>
<h2 id="heading-security-implementation">Security Implementation</h2>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Security Measures"
        CORS[CORS Policy]
        InputValidation[Input Validation]
        APIKeys[API Key Management]
        ErrorHandling[Error Handling]
    end

    Request[User Request] --&gt; CORS
    CORS --&gt; InputValidation
    InputValidation --&gt; Processing[Request Processing]
    Processing --&gt; APIKeys
    APIKeys --&gt; ExternalService[External Services]
    Processing --&gt; ErrorHandling
    ErrorHandling --&gt; Response[Secure Response]
</code></pre>
<p>Key security considerations:</p>
<ol>
<li><strong>CORS Configuration</strong>: Prevents unauthorized cross-origin requests</li>
<li><strong>Input Validation</strong>: Sanitizes and validates all user input</li>
<li><strong>API Key Management</strong>: Securely stores and manages Azure API keys</li>
<li><strong>Error Handling</strong>: Prevents information leakage in error responses</li>
</ol>
<h2 id="heading-future-enhancements">Future Enhancements</h2>
<p>The system is designed for extensibility, with planned enhancements:</p>
<pre><code class="lang-mermaid">timeline
    title Development Roadmap
    Phase 1 : Basic OCR and Medication Identification
    Phase 2 : Detailed Medication Information
    Phase 3 : User Accounts and Prescription History
    Phase 4 : Mobile Application Development
    Phase 5 : Pharmacy System Integration
    Phase 6 : Multi-language Support
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>AushadhiAI represents a powerful application of AI technology to solve real-world healthcare challenges. By combining Azure Computer Vision's advanced OCR capabilities with custom medication identification algorithms, the system effectively bridges the gap between handwritten prescriptions and patient understanding.</p>
<p>The architecture balances performance, reliability, and user experience, with careful consideration given to fallback mechanisms that ensure the system remains functional even when cloud services are unavailable.</p>
<p>Through its modern deployment architecture and thoughtful technical implementation, AushadhiAI demonstrates how cloud-native applications can deliver meaningful solutions to everyday problems. </p>
]]></content:encoded></item><item><title><![CDATA[From Logs to Insights: Unleashing the Power of Nginx Log Analysis]]></title><description><![CDATA[Introduction
In the world of web server operations, log files represent the ground truth of what's happening on your servers. Every request, response, error, and interaction is meticulously recorded, creating a treasure trove of operational intellige...]]></description><link>https://nikhilmishra.xyz/nginx-log-analyzer</link><guid isPermaLink="true">https://nikhilmishra.xyz/nginx-log-analyzer</guid><category><![CDATA[Devops]]></category><category><![CDATA[Logs]]></category><category><![CDATA[nginx]]></category><category><![CDATA[log analysis]]></category><category><![CDATA[infrastructure]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Thu, 06 Mar 2025 06:34:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741242816174/3f03dcfc-1887-4959-a518-19a1eb42e87e.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the world of web server operations, log files represent the ground truth of what's happening on your servers. Every request, response, error, and interaction is meticulously recorded, creating a treasure trove of operational intelligence waiting to be unlocked. Yet, the sheer volume and cryptic format of these logs make them inaccessible without proper analysis techniques.</p>
<p>This blog post explores the fundamental principles behind log analysis, specifically focusing on Nginx access logs. By taking a first principles approach, we'll deconstruct not just how to analyze logs, but why certain patterns and methodologies yield valuable insights that can transform raw data into actionable information.</p>
<h2 id="heading-understanding-web-server-logs-from-first-principles">Understanding Web Server Logs from First Principles</h2>
<h3 id="heading-the-anatomy-of-a-log-entry">The Anatomy of a Log Entry</h3>
<p>At its most fundamental level, a web server log entry is an event record that captures a specific interaction between a client and a server. Let's break down the standard Nginx combined log format from first principles:</p>
<pre><code><span class="hljs-number">127.0</span><span class="hljs-number">.0</span><span class="hljs-number">.1</span> - frank [<span class="hljs-number">10</span>/Oct/<span class="hljs-number">2023</span>:<span class="hljs-number">13</span>:<span class="hljs-number">55</span>:<span class="hljs-number">36</span> +<span class="hljs-number">0000</span>] <span class="hljs-string">"GET /index.html HTTP/1.1"</span> <span class="hljs-number">200</span> <span class="hljs-number">2326</span> <span class="hljs-string">"http://example.com/start.html"</span> <span class="hljs-string">"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"</span>
</code></pre><p>Each component of this log entry serves a specific purpose:</p>
<pre><code class="lang-mermaid">graph TD
    A[Complete Log Entry] --&gt; B[Client Identity]
    A --&gt; C[Authentication]
    A --&gt; D[Timestamp]
    A --&gt; E[HTTP Request]
    A --&gt; F[Response Status]
    A --&gt; G[Response Size]
    A --&gt; H[Referrer]
    A --&gt; I[User Agent]

    B --&gt; B1[IP Address]
    C --&gt; C1[Basic Auth User]
    D --&gt; D1[Request Time]
    E --&gt; E1[HTTP Method]
    E --&gt; E2[Resource Path]
    E --&gt; E3[Protocol]
    F --&gt; F1[HTTP Status Code]
    G --&gt; G1[Bytes Transferred]
    H --&gt; H1[Referring URL]
    I --&gt; I1[Browser/Client Info]

    style A fill:#f96,stroke:#333,stroke-width:4px
</code></pre>
<p>This structured format represents a deliberate design choice: each field is positioned to capture a specific aspect of the HTTP transaction, creating a comprehensive record that can be parsed and analyzed systematically.</p>
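<p>Because the format is positional and quote-delimited, a single regular expression can recover every field in the diagram. A sketch in Python (one possible pattern; edge cases like quotes inside user agents are ignored here):</p>

```python
import re

# Each named group maps to one node in the diagram above
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

entry = ('127.0.0.1 - frank [10/Oct/2023:13:55:36 +0000] '
         '"GET /index.html HTTP/1.1" 200 2326 '
         '"http://example.com/start.html" '
         '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"')

fields = LOG_PATTERN.match(entry).groupdict()
# fields["ip"] == "127.0.0.1", fields["status"] == "200"
```

<p>Once parsed into named fields like this, every analysis in the rest of the post reduces to counting values of one field.</p>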
<h2 id="heading-the-information-theory-of-logs">The Information Theory of Logs</h2>
<p>From an information theory perspective, logs represent a form of data compression. Each log entry encodes a complex event (an HTTP transaction with multiple dimensions) into a single line of text. This compression is lossy—not every aspect of the transaction is recorded—but it preserves the most critical information needed for operational analysis.</p>
<p>The challenge lies in extracting and aggregating this information efficiently. This is where our Nginx Log Analyser comes in.</p>
<h2 id="heading-architectural-design-from-first-principles">Architectural Design from First Principles</h2>
<p>The architecture of our log analysis tool follows a pipeline pattern, where data flows through a series of transformations:</p>
<pre><code class="lang-mermaid">flowchart LR
    A[Raw Log File] --&gt; B[Extraction]
    B --&gt; C[Aggregation]
    C --&gt; D[Sorting]
    D --&gt; E[Filtering]
    E --&gt; F[Presentation]

    style A fill:#bbf,stroke:#333,stroke-width:2px
    style F fill:#bfb,stroke:#333,stroke-width:2px
</code></pre>
<p>This pipeline architecture has several inherent advantages:</p>
<ol>
<li>Each stage has a single responsibility</li>
<li>Stages can be optimized independently</li>
<li>The process can be parallelized if needed</li>
<li>New transformations can be added without changing others</li>
</ol>
<p>Let's explore each component of this architecture in detail.</p>
<h2 id="heading-data-extraction-the-foundation-of-analysis">Data Extraction: The Foundation of Analysis</h2>
<p>The first challenge in log analysis is extracting structured data from semi-structured text. Our tool uses the <code>awk</code> command to parse the log format:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Extract IP addresses</span>
awk <span class="hljs-string">'{print $1}'</span> <span class="hljs-string">"<span class="hljs-variable">$LOG_FILE</span>"</span>

<span class="hljs-comment"># Extract request paths</span>
awk -F<span class="hljs-string">'"'</span> <span class="hljs-string">'{print $2}'</span> <span class="hljs-string">"<span class="hljs-variable">$LOG_FILE</span>"</span> | awk <span class="hljs-string">'{print $2}'</span>

<span class="hljs-comment"># Extract status codes</span>
awk <span class="hljs-string">'{print $9}'</span> <span class="hljs-string">"<span class="hljs-variable">$LOG_FILE</span>"</span>

<span class="hljs-comment"># Extract user agents</span>
awk -F<span class="hljs-string">'"'</span> <span class="hljs-string">'{print $6}'</span> <span class="hljs-string">"<span class="hljs-variable">$LOG_FILE</span>"</span>
</code></pre>
<p>This approach demonstrates an important principle: effective data extraction requires understanding the structure of your data. By using field separators (<code>-F'"'</code>) and positional arguments, we can precisely target specific components of the log entry.</p>
<h2 id="heading-aggregation-and-frequency-analysis">Aggregation and Frequency Analysis</h2>
<p>Once we've extracted the raw fields, we need to count occurrences to understand patterns. The <code>sort | uniq -c</code> pipeline is a powerful pattern for frequency analysis:</p>
<pre><code class="lang-bash">awk <span class="hljs-string">'{print $1}'</span> <span class="hljs-string">"<span class="hljs-variable">$LOG_FILE</span>"</span> | sort | uniq -c
</code></pre>
<p>This pattern demonstrates a key principle in data analysis: transforming individual observations into aggregate statistics reveals patterns that are invisible at the individual level.</p>
<p>From a statistical perspective, this is a form of frequency distribution analysis—we're creating a histogram of occurrences to identify the most common values.</p>
<h2 id="heading-data-sorting-and-selection">Data Sorting and Selection</h2>
<p>After aggregation, we need to prioritize the most significant findings:</p>
<pre><code class="lang-bash">sort -rn | head -5
</code></pre>
<p>This simple command pair embodies an important analytical principle: ranking and selection help manage information overload. By sorting numerically in reverse order (<code>-rn</code>) and limiting results (<code>head -5</code>), we focus attention on the most statistically significant patterns.</p>
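<p>For comparison, the whole <code>sort | uniq -c | sort -rn | head -5</code> chain collapses to one call in Python, since <code>collections.Counter</code> performs the aggregation and ranking in a single structure (sample IPs are made up):</p>

```python
from collections import Counter

# Toy extracted field values, as awk '{print $1}' would emit them
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1", "10.0.0.2"]

# Aggregate frequencies and rank, equivalent to sort|uniq -c|sort -rn|head -5
top = Counter(ips).most_common(5)
# top == [("10.0.0.1", 3), ("10.0.0.2", 2), ("10.0.0.3", 1)]
```

<p>The shell version wins on memory for huge inputs; the Python version wins once you want to post-process the counts programmatically.</p>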
<h2 id="heading-the-execution-flow-sequence-and-processing">The Execution Flow: Sequence and Processing</h2>
<p>The complete processing flow of our tool follows this sequence:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User
    participant Script
    participant LogFile
    participant AWK
    participant Sort
    participant Uniq
    participant Head

    User-&gt;&gt;Script: Execute with log file
    Script-&gt;&gt;Script: Validate arguments
    Script-&gt;&gt;LogFile: Read log file

    Note over Script,Head: IP Address Analysis
    Script-&gt;&gt;AWK: Extract IP addresses
    AWK-&gt;&gt;Sort: Pipe extracted IPs
    Sort-&gt;&gt;Uniq: Count unique IPs
    Uniq-&gt;&gt;Sort: Sort by frequency
    Sort-&gt;&gt;Head: Select top 5
    Head-&gt;&gt;Script: Return results
    Script-&gt;&gt;User: Display top IPs

    Note over Script,Head: Request Path Analysis
    Script-&gt;&gt;AWK: Extract request paths
    AWK-&gt;&gt;AWK: Process field 2
    AWK-&gt;&gt;Sort: Pipe extracted paths
    Sort-&gt;&gt;Uniq: Count unique paths
    Uniq-&gt;&gt;Sort: Sort by frequency
    Sort-&gt;&gt;Head: Select top 5
    Head-&gt;&gt;Script: Return results
    Script-&gt;&gt;User: Display top paths

    Note over Script,Head: Status Code Analysis
    Script-&gt;&gt;AWK: Extract status codes
    AWK-&gt;&gt;Sort: Pipe extracted codes
    Sort-&gt;&gt;Uniq: Count unique codes
    Uniq-&gt;&gt;Sort: Sort by frequency
    Sort-&gt;&gt;Head: Select top 5
    Head-&gt;&gt;Script: Return results
    Script-&gt;&gt;User: Display top status codes

    Note over Script,Head: User Agent Analysis
    Script-&gt;&gt;AWK: Extract user agents
    AWK-&gt;&gt;Sort: Pipe extracted agents
    Sort-&gt;&gt;Uniq: Count unique agents
    Uniq-&gt;&gt;Sort: Sort by frequency
    Sort-&gt;&gt;Head: Select top 5
    Head-&gt;&gt;Script: Return results
    Script-&gt;&gt;User: Display top user agents
</code></pre>
<p>This sequence diagram reveals an important architectural pattern: the same data transformation pipeline is applied to different aspects of the log data, creating a consistent analytical approach across dimensions.</p>
<h2 id="heading-statistical-insights-from-log-analysis">Statistical Insights from Log Analysis</h2>
<p>The output of our analysis provides four distinct views into server activity:</p>
<ol>
<li><strong>Top IP Addresses</strong>: Identifies potential heavy users, bots, or attackers</li>
<li><strong>Top Requested Paths</strong>: Reveals the most popular content or potential targets</li>
<li><strong>Top Response Codes</strong>: Indicates the overall health and common issues</li>
<li><strong>Top User Agents</strong>: Shows which clients/browsers are most common</li>
</ol>
<p>These four dimensions create a multi-faceted view of server activity:</p>
<pre><code class="lang-mermaid">graph TD
    A[Nginx Log Analysis] --&gt; B[Traffic Sources]
    A --&gt; C[Content Popularity]
    A --&gt; D[Server Health]
    A --&gt; E[Client Demographics]

    B --&gt; B1[Top IP Addresses]
    C --&gt; C1[Top Requested Paths]
    D --&gt; D1[Top Response Codes]
    E --&gt; E1[Top User Agents]

    style A fill:#f96,stroke:#333,stroke-width:4px
</code></pre>
<p>From a data science perspective, this approach demonstrates the power of dimensional analysis—examining the same dataset through different lenses to reveal complementary insights.</p>
<h2 id="heading-the-mathematics-of-log-analysis">The Mathematics of Log Analysis</h2>
<p>The underlying mathematical principles of our analysis are based on frequency counting and ranking. If we consider the set of all log entries L, and a function f that extracts a specific field (such as IP address) from each entry, the counting process can be expressed as:</p>
<p>For any value v in the range of f, the count C(v) is:</p>
<p>C(v) = |{l ∈ L : f(l) = v}|</p>
<p>We then sort these counts in descending order and select the top k values:</p>
<p>TopK(f, L, k) = First k elements of Sort({(v, C(v)) : v in range of f(L)})</p>
<p>This mathematical formulation reveals that our seemingly simple shell commands are implementing a sophisticated statistical aggregation and ranking algorithm.</p>
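<p>The TopK formulation translates almost verbatim into code: <code>f</code> is the field extractor, <code>Counter</code> computes C(v), and <code>most_common</code> performs the sort-and-select. A sketch with made-up log entries:</p>

```python
from collections import Counter

def top_k(f, entries, k):
    """TopK(f, L, k): the k most frequent values of f over entries."""
    return Counter(f(entry) for entry in entries).most_common(k)

logs = [
    {"ip": "1.1.1.1", "status": 200},
    {"ip": "2.2.2.2", "status": 404},
    {"ip": "1.1.1.1", "status": 200},
]
top_k(lambda e: e["ip"], logs, 1)  # [("1.1.1.1", 2)]
```

<p>Note that the same <code>top_k</code> works for every dimension of the analysis; only the extractor <code>f</code> changes, which mirrors how the script reuses one pipeline with different <code>awk</code> expressions.</p>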
<h2 id="heading-performance-considerations-the-time-space-tradeoff">Performance Considerations: The Time-Space Tradeoff</h2>
<p>The Unix pipeline architecture of our tool makes an important tradeoff: data streams through each stage, and when the input exceeds available memory, <code>sort</code> spills to temporary files on disk. This keeps memory usage bounded at the cost of extra I/O and CPU time, which is an appropriate tradeoff for most log files, as it allows analysis of logs far too large to fit in memory.</p>
<p>The time complexity of our approach is approximately:</p>
<ul>
<li>Extraction: O(n) where n is the number of log entries</li>
<li>Sorting: O(n log n)</li>
<li>Counting: O(n)</li>
<li>Final sorting: O(k log k) where k is the number of unique values</li>
<li>Selection: O(k)</li>
</ul>
<p>The dominant factor is the O(n log n) sorting step, which is necessary for the frequency analysis.</p>
<h2 id="heading-beyond-basic-analysis-a-path-forward">Beyond Basic Analysis: A Path Forward</h2>
<p>From our first principles analysis, several natural extensions emerge:</p>
<pre><code class="lang-mermaid">mindmap
  root((Log Analysis))
    Temporal Analysis
      Request patterns by hour
      Daily traffic trends
      Session duration
    Geographic Insights
      IP geolocation
      Regional traffic patterns
    Performance Metrics
      Response time analysis
      Bandwidth consumption
      Cache effectiveness
    Security Analysis
      Attack pattern detection
      Anomaly identification
      Bot traffic filtering
    Content Analysis
      Path structure mapping
      Content popularity by type
      Error frequency by section
    User Behavior
      Session tracking
      Navigation paths
      Conversion funnels
</code></pre>
<p>Each of these extensions builds upon the fundamental principles established in our base tool but adds new dimensions of analysis that can provide deeper insights.</p>
<h2 id="heading-shell-scripting-as-a-data-science-tool">Shell Scripting as a Data Science Tool</h2>
<p>It's worth noting that our approach uses basic Unix shell commands to perform what would typically be considered data science tasks:</p>
<ul>
<li>Data extraction (awk)</li>
<li>Aggregation (sort, uniq)</li>
<li>Sorting and ranking (sort -rn)</li>
<li>Selection (head)</li>
</ul>
<p>This demonstrates an important principle: powerful analysis doesn't always require complex tools. The Unix philosophy of "small tools that do one thing well" creates a flexible analytical framework when these tools are combined effectively.</p>
<h2 id="heading-conclusion-from-logs-to-insight">Conclusion: From Logs to Insight</h2>
<p>By approaching log analysis from first principles, we've seen how a seemingly simple task—counting occurrences in a text file—can reveal sophisticated patterns and insights about web server operation. Our Nginx Log Analyser demonstrates how fundamental computational and statistical techniques can transform raw data into actionable intelligence.</p>
<p>The power of this approach lies not just in what it can tell us about our servers today, but in how it establishes a foundation for more sophisticated analysis. By understanding the basic principles of extraction, aggregation, and ranking, we can build increasingly powerful analytical tools that help us understand and optimize our web infrastructure.</p>
<p>In a world increasingly driven by data, the ability to extract meaningful patterns from raw logs is not just a technical skill—it's a competitive advantage. By mastering these fundamental techniques, we unlock the valuable information hidden in plain sight in our server logs.</p>
<h2 id="heading-about-the-author">About the Author</h2>
<p>I'm a DevOps engineer and systems architect passionate about applying first principles thinking to infrastructure analysis and optimization. This project is part of the roadmap.sh learning path for server administration and monitoring.</p>
<hr />
<p><em>For more information about log analysis best practices, visit <a target="_blank" href="https://roadmap.sh/projects/nginx-log-analyser">roadmap.sh/projects/nginx-log-analyser</a></em></p>
]]></content:encoded></item><item><title><![CDATA[Log Management & Archiving: A First Principles Deep Dive]]></title><description><![CDATA[Introduction
Log files are the silent sentinels of our systems—recording events, errors, and activities that are crucial for troubleshooting, security auditing, and compliance. Yet, they present a unique challenge: they're essential to keep, but they...]]></description><link>https://nikhilmishra.xyz/log-archive-tool</link><guid isPermaLink="true">https://nikhilmishra.xyz/log-archive-tool</guid><category><![CDATA[log management]]></category><category><![CDATA[Devops]]></category><category><![CDATA[SRE]]></category><category><![CDATA[#cybersecurity]]></category><category><![CDATA[System Design]]></category><category><![CDATA[archiving legacy data]]></category><category><![CDATA[infrastructure]]></category><category><![CDATA[automation]]></category><category><![CDATA[first-principle]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Wed, 05 Mar 2025 11:22:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741173599690/9c22378c-ec6d-48d9-8974-d84d5cf1f874.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Log files are the silent sentinels of our systems—recording events, errors, and activities that are crucial for troubleshooting, security auditing, and compliance. Yet, they present a unique challenge: they're essential to keep, but they consume resources and can quickly become unwieldy. In this blog post, we'll explore the fundamental principles behind efficient log management and how our Log Archive Tool addresses these challenges from first principles.</p>
<h2 id="heading-understanding-log-management-from-first-principles">Understanding Log Management from First Principles</h2>
<h3 id="heading-the-fundamental-problem-space">The Fundamental Problem Space</h3>
<p>At its core, log management is about balancing several competing concerns:</p>
<ol>
<li><strong>Information Preservation</strong>: Maintaining historical records of system activities and events</li>
<li><strong>Resource Optimization</strong>: Preventing log files from consuming excessive disk space</li>
<li><strong>Retrieval Efficiency</strong>: Ensuring logs remain accessible when needed for analysis</li>
<li><strong>Security &amp; Compliance</strong>: Safeguarding sensitive information while meeting retention requirements</li>
</ol>
<p>Rather than treating log management as a mundane operational task, we can approach it as an information lifecycle management problem with specific constraints and objectives.</p>
<h2 id="heading-the-log-lifecycle">The Log Lifecycle</h2>
<p>From a first principles perspective, logs undergo a predictable lifecycle:</p>
<pre><code class="lang-mermaid">graph LR
    A[Log Creation] --&gt; B[Active Use]
    B --&gt; C[Retention Period]
    C --&gt; D[Archive]
    D --&gt; E[Eventual Disposal]

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style C fill:#bfb,stroke:#333,stroke-width:2px
    style D fill:#fbb,stroke:#333,stroke-width:2px
    style E fill:#ddd,stroke:#333,stroke-width:2px
</code></pre>
<p>Understanding this lifecycle reveals why most log management solutions fail: they focus on only one part of this cycle (usually creation and active use) without considering the entire information flow.</p>
<h2 id="heading-technical-architecture-of-the-log-archive-tool">Technical Architecture of the Log Archive Tool</h2>
<p>Our Log Archive Tool approaches the problem holistically, addressing multiple stages of the log lifecycle in a single, cohesive solution:</p>
<pre><code class="lang-mermaid">flowchart TB
    A[Log Archive Tool] --&gt; B[Archiving Engine]
    A --&gt; C[Notification System]
    A --&gt; D[Secure Transfer Mechanism]

    B --&gt; B1[File Compression]
    B --&gt; B2[Timestamp Generation]
    B --&gt; B3[Archive Creation]

    C --&gt; C1[Email Notification]
    C --&gt; C2[Status Reporting]

    D --&gt; D1[SSH Authentication]
    D --&gt; D2[SCP Transfer]
    D --&gt; D3[Remote Storage]

    style A fill:#f96,stroke:#333,stroke-width:2px
    style B fill:#9cf,stroke:#333,stroke-width:2px
    style C fill:#9f9,stroke:#333,stroke-width:2px
    style D fill:#f99,stroke:#333,stroke-width:2px
</code></pre>
<p>This architecture separates concerns while ensuring that each component works in concert with the others. Let's explore each component in detail.</p>
<h2 id="heading-compression-information-theory-in-practice">Compression: Information Theory in Practice</h2>
<p>From an information theory perspective, log files are ideal candidates for compression because they often contain repeated patterns and redundant information. Our tool leverages the gzip compression algorithm through tar:</p>
<pre><code class="lang-bash">sudo tar -czf <span class="hljs-string">"<span class="hljs-variable">$ARCHIVE_DIR</span>/<span class="hljs-variable">$ARCHIVE_FILENAME</span>"</span> -C <span class="hljs-string">"<span class="hljs-variable">$LOG_DIR</span>"</span> .
</code></pre>
<p>This single line represents a sophisticated compression process:</p>
<ul>
<li>The <code>-c</code> flag creates a new archive</li>
<li>The <code>-z</code> flag applies gzip compression</li>
<li>The <code>-f</code> flag specifies the output file</li>
<li>The <code>-C</code> flag changes into the log directory first, so paths stored in the archive are relative rather than absolute</li>
</ul>
<p>By compressing log files, we typically achieve compression ratios of 5:1 to 10:1, dramatically reducing storage requirements while preserving all information. This is not just a practical benefit but a fundamental application of information theory.</p>
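<p>You can see why the ratios are so favourable with a quick experiment: gzip (the same algorithm <code>tar -z</code> applies) on a synthetic log made of one repeated line. The line below is made up; real logs vary more per line, which is why they land nearer the 5:1&ndash;10:1 range than this extreme case:</p>

```python
import gzip

line = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326\n'
raw = (line * 10_000).encode()      # highly repetitive synthetic log
packed = gzip.compress(raw)
ratio = len(raw) / len(packed)
# Repetition is exactly what DEFLATE exploits, so this ratio is far
# above the real-world figures quoted above
```
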
<h2 id="heading-timestamp-generation-the-importance-of-temporal-context">Timestamp Generation: The Importance of Temporal Context</h2>
<p>Time is a critical dimension in log analysis. Our tool generates unique timestamps using the command:</p>
<pre><code class="lang-bash">TIMESTAMP=$(date +<span class="hljs-string">"%Y%m%d_%H%M%S"</span>)
</code></pre>
<p>This format ensures:</p>
<ol>
<li><strong>Chronological sorting</strong>: Archives naturally sort in chronological order</li>
<li><strong>Unambiguous identification</strong>: Each archive has a unique identifier</li>
<li><strong>Human readability</strong>: The format is easily interpretable</li>
</ol>
<p>From first principles, this timestamp serves as both an identifier and metadata, embedding temporal context directly into the artifact name.</p>
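<p>The same format reproduced with Python's <code>datetime</code> makes the chronological-sorting property concrete: lexicographic order of the strings equals time order, because the fields run from most to least significant:</p>

```python
from datetime import datetime

# Same %Y%m%d_%H%M%S format the script passes to date(1)
ts = datetime(2025, 3, 5, 11, 22, 58).strftime("%Y%m%d_%H%M%S")
# ts == "20250305_112258"

earlier = datetime(2025, 3, 5, 9, 0, 0).strftime("%Y%m%d_%H%M%S")
assert earlier < ts  # plain string comparison matches time order
```
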
<h2 id="heading-the-execution-flow-a-sequence-based-approach">The Execution Flow: A Sequence-Based Approach</h2>
<p>The tool's execution flow follows a logical sequence that minimizes failure points and ensures data integrity:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User
    participant Script
    participant LocalSystem
    participant Email
    participant RemoteSystem

    User-&gt;&gt;Script: Execute with parameters
    Script-&gt;&gt;Script: Validate input parameters
    Script-&gt;&gt;LocalSystem: Check directory existence
    LocalSystem--&gt;&gt;Script: Directory status

    alt Directory does not exist
        Script-&gt;&gt;User: Error message
    else Directory exists
        Script-&gt;&gt;LocalSystem: Create archive directory
        Script-&gt;&gt;LocalSystem: Generate timestamp
        Script-&gt;&gt;LocalSystem: Compress logs
        LocalSystem--&gt;&gt;Script: Compression complete

        Script-&gt;&gt;Email: Send notification
        Email--&gt;&gt;Script: Email sent

        Script-&gt;&gt;Script: Parse remote destination
        Script-&gt;&gt;RemoteSystem: Transfer archive via SCP
        RemoteSystem--&gt;&gt;Script: Transfer status

        Script-&gt;&gt;User: Display completion status
    end
</code></pre>
<p>This sequence diagram reveals an important architectural principle: the tool handles error conditions early and proceeds only when preconditions are met, creating a robust execution path.</p>
<h2 id="heading-remote-backup-the-distributed-systems-approach">Remote Backup: The Distributed Systems Approach</h2>
<p>Perhaps the most sophisticated aspect of our tool is its approach to distributed storage. By leveraging SCP (Secure Copy Protocol), the tool ensures:</p>
<ol>
<li><strong>Data Integrity</strong>: Files are transferred without corruption</li>
<li><strong>Security</strong>: All data is encrypted during transit</li>
<li><strong>Authentication</strong>: SSH key-based authentication prevents unauthorized access</li>
</ol>
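<p>SCP protects data in transit; an end-to-end integrity check can be layered on top by recording a checksum before transfer and verifying it after. A sketch with a hypothetical stand-in archive (using <code>tee</code> instead of shell redirection):</p>
<pre><code class="lang-bash"># Create a stand-in archive, record its SHA-256, then verify it
echo "sample archive contents" | tee /tmp/sample_archive.tar.gz
sha256sum /tmp/sample_archive.tar.gz | tee /tmp/sample_archive.sha256
sha256sum -c /tmp/sample_archive.sha256   # reports OK when the file is intact
</code></pre>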
<p>The implementation parses the remote destination string to extract the necessary components:</p>
<pre><code class="lang-bash">REMOTE_USER=$(<span class="hljs-built_in">echo</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_BACKUP</span>"</span> | cut -d<span class="hljs-string">'@'</span> -f1)
REMOTE_HOST=$(<span class="hljs-built_in">echo</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_BACKUP</span>"</span> | cut -d<span class="hljs-string">'@'</span> -f2 | cut -d<span class="hljs-string">':'</span> -f1)
REMOTE_PATH=$(<span class="hljs-built_in">echo</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_BACKUP</span>"</span> | cut -d<span class="hljs-string">':'</span> -f2)
</code></pre>
<p>This parsing demonstrates a key principle in distributed systems: the separation of identity (user), location (host), and storage (path) as distinct components that together form a complete resource identifier.</p>
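<p>The same split can be done without subshells using shell parameter expansion; the destination string below is a hypothetical example:</p>
<pre><code class="lang-bash">REMOTE_BACKUP="backup@example.com:/var/backups/logs"

REMOTE_USER=${REMOTE_BACKUP%%@*}    # everything before the first '@'
hostpath=${REMOTE_BACKUP#*@}        # drop the "user@" prefix
REMOTE_HOST=${hostpath%%:*}         # everything before the ':'
REMOTE_PATH=${hostpath#*:}          # everything after the ':'

echo "$REMOTE_USER $REMOTE_HOST $REMOTE_PATH"
</code></pre>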
<h2 id="heading-notification-system-closing-the-feedback-loop">Notification System: Closing the Feedback Loop</h2>
<p>From a cybernetic perspective, any effective system requires feedback loops. Our notification system serves this purpose:</p>
<pre><code class="lang-bash">{
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Subject: Log Archive Notification"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"To: <span class="hljs-variable">$EMAIL</span>"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Content-Type: text/plain"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">""</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Logs have been archived successfully on <span class="hljs-subst">$(date)</span>."</span>
} | sendmail -t
</code></pre>
<p>This seemingly simple notification achieves several important functions:</p>
<ol>
<li>It confirms successful operation</li>
<li>It provides an audit trail of archiving activities</li>
<li>It alerts administrators to potential issues if expected notifications don't arrive</li>
</ol>
<h2 id="heading-system-integration-architecture">System Integration Architecture</h2>
<p>When viewed as part of a broader system, our Log Archive Tool occupies a specific position in the infrastructure architecture:</p>
<pre><code class="lang-mermaid">graph TD
    A[Application Servers] --&gt; B[Log Files]
    B --&gt; C[Log Archive Tool]
    C --&gt; D[Local Archive]
    C --&gt; E[Email Notification]
    C --&gt; F[Remote Backup Storage]

    G[Monitoring System] -.-&gt; E
    H[Disaster Recovery] -.-&gt; F
    I[Compliance Audit] -.-&gt; F

    style C fill:#f96,stroke:#333,stroke-width:4px
</code></pre>
<p>This architectural view reveals how our tool serves as a critical junction between active systems and various downstream consumers of log data, from monitoring systems to compliance auditors.</p>
<h2 id="heading-security-considerations-from-first-principles">Security Considerations from First Principles</h2>
<p>Security is not an add-on but a fundamental aspect of any log management solution. From first principles, we can identify several security requirements:</p>
<ol>
<li><strong>Confidentiality</strong>: Logs often contain sensitive information</li>
<li><strong>Integrity</strong>: Logs must not be tampered with</li>
<li><strong>Availability</strong>: Logs must be accessible when needed</li>
<li><strong>Non-repudiation</strong>: The authenticity of logs must be verifiable</li>
</ol>
<p>Our tool addresses these through:</p>
<ul>
<li>Executing with elevated permissions (<code>sudo</code>) to access protected logs</li>
<li>Using SSH for secure, encrypted transfers</li>
<li>Preserving file ownership and permissions during archiving</li>
<li>Creating immutable archives with timestamps</li>
</ul>
<h2 id="heading-future-extensions-evolutionary-architecture">Future Extensions: Evolutionary Architecture</h2>
<p>From our first principles analysis, several natural extensions emerge:</p>
<pre><code class="lang-mermaid">mindmap
  root((Log Archive Tool))
    Retention Policies
      Time-based expiration
      Space-based cleanup
    Enhanced Compression
      Deduplication
      Differential archiving
    Security Enhancements
      Cryptographic signing
      Encryption at rest
    Analytics Integration
      Automated log parsing
      Anomaly detection
    Scalability
      Multi-server coordination
      Distributed processing
    Cloud Integration
      S3/Azure/GCP storage
      Serverless triggers
</code></pre>
<p>These extensions follow naturally from the core principles we've established, showing how a first principles approach enables organic, coherent system evolution.</p>
<h2 id="heading-conclusion-the-art-of-log-management">Conclusion: The Art of Log Management</h2>
<p>By approaching log management from first principles, we've transformed what might seem like a mundane operational task into a sophisticated information lifecycle management solution. Our Log Archive Tool embodies these principles through:</p>
<ol>
<li><strong>Efficiency</strong>: Minimizing resource usage through compression</li>
<li><strong>Reliability</strong>: Ensuring logs are preserved through multiple mechanisms</li>
<li><strong>Security</strong>: Protecting sensitive information throughout the process</li>
<li><strong>Usability</strong>: Simplifying operations with clear interfaces and feedback</li>
</ol>
<p>The true art of systems engineering lies not in complexity but in finding elegant solutions to fundamental problems. By understanding the first principles of log management, we've created a tool that's both powerful and remarkably simple.</p>
<h2 id="heading-about-the-author">About the Author</h2>
<p>I'm a systems engineer passionate about infrastructure automation and applying first principles thinking to DevOps challenges. This project is part of the roadmap.sh learning path for server administration.</p>
<hr />
<p><em>For more information about log management best practices, visit <a target="_blank" href="https://roadmap.sh/projects/log-archive-tool">roadmap.sh/projects/log-archive-tool</a></em></p>
]]></content:encoded></item><item><title><![CDATA[Server Monitoring From First Principles: Building a Custom Server-Stats Tool]]></title><description><![CDATA[Introduction
In today's complex IT infrastructure, understanding server performance is not just a convenience—it's a necessity. This blog post explores the fundamental principles behind server monitoring and dives deep into how we built a comprehensi...]]></description><link>https://nikhilmishra.xyz/server-stats</link><guid isPermaLink="true">https://nikhilmishra.xyz/server-stats</guid><category><![CDATA[remote server monitoring]]></category><category><![CDATA[Devops]]></category><category><![CDATA[IT Infrastructure]]></category><category><![CDATA[Linux]]></category><category><![CDATA[System administration]]></category><category><![CDATA[Performance Optimization]]></category><category><![CDATA[metrics]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Tue, 04 Mar 2025 11:04:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741086067901/77a75300-1a86-4bbd-93fd-39b7ea1934a9.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In today's complex IT infrastructure, understanding server performance is not just a convenience—it's a necessity. This blog post explores the fundamental principles behind server monitoring and dives deep into how we built a comprehensive server statistics tool from scratch.</p>
<p>As engineers, we often rely on sophisticated monitoring tools without understanding their inner workings. By breaking down our approach to first principles, we'll gain insights into not just how to monitor servers, but why specific metrics matter and how they interrelate.</p>
<h2 id="heading-understanding-the-fundamentals-of-server-monitoring">Understanding the Fundamentals of Server Monitoring</h2>
<h3 id="heading-why-monitor-servers">Why Monitor Servers?</h3>
<p>At its core, server monitoring solves several critical problems:</p>
<ol>
<li><strong>Proactive Issue Detection</strong>: Identifying problems before they impact users</li>
<li><strong>Performance Optimization</strong>: Finding bottlenecks that limit system capability</li>
<li><strong>Capacity Planning</strong>: Understanding resource utilization trends</li>
<li><strong>Security Oversight</strong>: Detecting unusual patterns that may indicate breaches</li>
</ol>
<p>The foundation of effective monitoring lies in knowing which metrics truly matter. Let's break this down by examining the fundamental resources every server manages.</p>
<h2 id="heading-the-four-pillars-of-server-resources">The Four Pillars of Server Resources</h2>
<p>From first principles, every server manages four essential resources:</p>
<pre><code class="lang-mermaid">graph TD
    A[Server Resources] --&gt; B[CPU]
    A --&gt; C[Memory]
    A --&gt; D[Disk]
    A --&gt; E[Network]

    B --&gt; B1[Processing Power]
    B --&gt; B2[Task Scheduling]

    C --&gt; C1[Data Storage]
    C --&gt; C2[Application State]

    D --&gt; D1[Persistent Storage]
    D --&gt; D2[I/O Operations]

    E --&gt; E1[Data Transfer]
    E --&gt; E2[Communication]
</code></pre>
<p>Understanding how these resources interact and depend on each other is crucial. For example, insufficient memory can lead to excessive disk swapping, creating an I/O bottleneck that appears as a disk problem but originates from memory constraints.</p>
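<p>That chain of causation can be spotted from the swap counters. A quick check (Linux assumed; nonzero and growing swap usage alongside high iowait points back to memory pressure):</p>
<pre><code class="lang-bash"># Report swap consumption in MB from the "Swap:" row of free
swap_used=$(free -m | awk '/^Swap:/ {print $3}')
echo "Swap used: ${swap_used}MB"
</code></pre>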
<h2 id="heading-architecture-of-our-monitoring-solution">Architecture of Our Monitoring Solution</h2>
<p>Our server-stats tool follows a modular design pattern where each component focuses on monitoring a specific aspect of the system:</p>
<pre><code class="lang-mermaid">flowchart TD
    A[server-stats.sh] --&gt; B[System Information]
    A --&gt; C[Resource Monitoring]
    A --&gt; D[Security Metrics]

    B --&gt; B1[OS Version]
    B --&gt; B2[System Uptime]
    B --&gt; B3[Load Average]
    B --&gt; B4[Current Date/Time]

    C --&gt; C1[CPU Usage]
    C --&gt; C2[Memory Utilization]
    C --&gt; C3[Disk Space Analysis]
    C --&gt; C4[Top Processes]

    D --&gt; D1[Active User Sessions]
    D --&gt; D2[Failed Login Attempts]
    D --&gt; D3[Auth Log Analysis]
</code></pre>
<p>This architecture allows for:</p>
<ul>
<li>Independent development of each module</li>
<li>Easy extensibility to add new metrics</li>
<li>Clear separation of concerns</li>
</ul>
<h2 id="heading-deep-dive-implementation-from-first-principles">Deep Dive: Implementation From First Principles</h2>
<h3 id="heading-cpu-monitoring">CPU Monitoring</h3>
<p>From first principles, CPU usage is fundamentally about time allocation. When we measure CPU percentage, we're asking: "Of the total available CPU time, how much was spent on actual work versus waiting?"</p>
<p>The Linux kernel tracks CPU time in multiple categories:</p>
<ul>
<li><strong>user</strong>: Time spent running user space processes</li>
<li><strong>nice</strong>: Time spent running niced processes (with adjusted priority)</li>
<li><strong>system</strong>: Time spent in kernel operations</li>
<li><strong>idle</strong>: Time when CPU had nothing to process</li>
<li><strong>iowait</strong>: Time spent waiting for I/O operations</li>
<li><strong>irq/softirq</strong>: Time handling interrupts</li>
</ul>
<p>Our implementation extracts this data directly from <code>/proc/stat</code>:</p>
<pre><code class="lang-bash"><span class="hljs-keyword">function</span> cpu_usage {
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"=== Total CPU Usage ==="</span>
    cpu_info=$(grep <span class="hljs-string">'cpu '</span> /proc/<span class="hljs-built_in">stat</span>)
    user=$(<span class="hljs-built_in">echo</span> <span class="hljs-variable">$cpu_info</span> | awk <span class="hljs-string">'{print $2}'</span>)
    nice=$(<span class="hljs-built_in">echo</span> <span class="hljs-variable">$cpu_info</span> | awk <span class="hljs-string">'{print $3}'</span>)
    system=$(<span class="hljs-built_in">echo</span> <span class="hljs-variable">$cpu_info</span> | awk <span class="hljs-string">'{print $4}'</span>)
    idle=$(<span class="hljs-built_in">echo</span> <span class="hljs-variable">$cpu_info</span> | awk <span class="hljs-string">'{print $5}'</span>)

    total=$((user + nice + system + idle))
    used=$((user + nice + system))

    cpu_percentage=$((<span class="hljs-number">100</span> * used / total))

    <span class="hljs-built_in">echo</span> <span class="hljs-string">"CPU Usage: <span class="hljs-variable">$cpu_percentage</span>%"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">""</span>
}
</code></pre>
<p>While simplified, this function captures the essence of CPU monitoring by calculating the ratio of active time (user + nice + system) to total time.</p>
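<p>A more accurate variant samples <code>/proc/stat</code> twice and works on the delta, so it reports recent usage rather than the average since boot, and it counts <code>iowait</code> as idle time. A sketch, not part of the original script:</p>
<pre><code class="lang-bash"># First sample of the aggregate "cpu" line: user nice system idle iowait
set -- $(head -n1 /proc/stat)
u1=$2; n1=$3; s1=$4; i1=$5; w1=$6

sleep 1

# Second sample one second later
set -- $(head -n1 /proc/stat)
u2=$2; n2=$3; s2=$4; i2=$5; w2=$6

idle=$(( (i2 + w2) - (i1 + w1) ))
total=$(( (u2 + n2 + s2 + i2 + w2) - (u1 + n1 + s1 + i1 + w1) ))
echo "CPU Usage (last 1s): $(( 100 * (total - idle) / total ))%"
</code></pre>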
<h3 id="heading-memory-monitoring-from-first-principles">Memory Monitoring From First Principles</h3>
<p>Memory is fundamentally about allocation of finite storage space. The key insight is understanding the difference between available memory, used memory, and how the system manages memory pressure.</p>
<p>In Linux, the <code>free</code> command provides this information in an accessible format:</p>
<pre><code class="lang-bash"><span class="hljs-keyword">function</span> memory_usage {
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"=== Total Memory Usage ==="</span>
    mem_info=$(free -m)

    total_mem=$(<span class="hljs-built_in">echo</span> <span class="hljs-string">"<span class="hljs-variable">$mem_info</span>"</span> | awk <span class="hljs-string">'NR==2{print $2}'</span>)
    used_mem=$(<span class="hljs-built_in">echo</span> <span class="hljs-string">"<span class="hljs-variable">$mem_info</span>"</span> | awk <span class="hljs-string">'NR==2{print $3}'</span>)

    free_mem=$((total_mem - used_mem))

    mem_percentage=$((<span class="hljs-number">100</span> * used_mem / total_mem))

    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Total Memory: <span class="hljs-variable">${total_mem}</span>MB"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Used Memory: <span class="hljs-variable">${used_mem}</span>MB"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Free Memory: <span class="hljs-variable">${free_mem}</span>MB"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Memory Usage Percentage: <span class="hljs-variable">${mem_percentage}</span>%"</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">""</span>
}
</code></pre>
<p>It's worth noting that modern Linux kernels have sophisticated memory management that includes caching frequently used data. A more comprehensive analysis would distinguish between memory used by applications and memory used for cache, which can be released if needed.</p>
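<p>On kernels that expose <code>MemAvailable</code> (3.14 and later), the <code>available</code> column of <code>free</code> already accounts for reclaimable cache, making it a better health signal than raw free memory:</p>
<pre><code class="lang-bash"># "available" (column 7 of free -m) estimates memory usable by new
# workloads without swapping, including cache the kernel can reclaim
available_mem=$(free -m | awk '/^Mem:/ {print $7}')
echo "Available Memory: ${available_mem}MB"
</code></pre>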
<h2 id="heading-execution-flow">Execution Flow</h2>
<p>The overall execution flow of our server monitoring tool follows a sequential pattern:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User
    participant Script
    participant System

    User-&gt;&gt;Script: Execute ./server-stats.sh
    Script-&gt;&gt;System: Request date information
    System--&gt;&gt;Script: Return current date
    Script-&gt;&gt;User: Display date

    Script-&gt;&gt;System: Request OS version
    System--&gt;&gt;Script: Return OS details
    Script-&gt;&gt;User: Display OS version

    Script-&gt;&gt;System: Request uptime data
    System--&gt;&gt;Script: Return uptime
    Script-&gt;&gt;User: Display uptime

    Script-&gt;&gt;System: Check load average
    System--&gt;&gt;Script: Return load statistics
    Script-&gt;&gt;User: Display load average

    Script-&gt;&gt;System: Query user sessions
    System--&gt;&gt;Script: Return active sessions
    Script-&gt;&gt;User: Display logged in users

    Script-&gt;&gt;System: Check security logs
    System--&gt;&gt;Script: Return failed login count
    Script-&gt;&gt;User: Display failed attempts

    Script-&gt;&gt;System: Request CPU statistics
    System--&gt;&gt;Script: Return CPU data
    Script-&gt;&gt;User: Display CPU usage

    Script-&gt;&gt;System: Query memory allocation
    System--&gt;&gt;Script: Return memory statistics
    Script-&gt;&gt;User: Display memory usage

    Script-&gt;&gt;System: Request disk information
    System--&gt;&gt;Script: Return disk statistics
    Script-&gt;&gt;User: Display disk usage

    Script-&gt;&gt;System: Query resource-intensive processes
    System--&gt;&gt;Script: Return top processes
    Script-&gt;&gt;User: Display top processes
</code></pre>
<h2 id="heading-security-considerations">Security Considerations</h2>
<p>From first principles, system security involves detecting anomalies and unauthorized access attempts. Our tool incorporates basic security monitoring:</p>
<ol>
<li><strong>Active Sessions</strong>: Shows who is currently logged in, allowing administrators to identify unexpected users</li>
<li><strong>Failed Login Attempts</strong>: A sudden increase in failed logins often indicates a brute force attack</li>
</ol>
<pre><code class="lang-bash"><span class="hljs-keyword">function</span> failed_login_attempts {
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"=== Failed Login Attempts ==="</span>
    sudo grep -c <span class="hljs-string">'Failed password'</span> /var/log/auth.log
    <span class="hljs-built_in">echo</span> <span class="hljs-string">""</span>
}
</code></pre>
<p>A more comprehensive solution would include:</p>
<ul>
<li>Tracking login attempts by IP address</li>
<li>Monitoring for privilege escalation</li>
<li>Detecting unusual file system access patterns</li>
<li>Checking for modifications to critical system files</li>
</ul>
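<p>The first of those extensions fits in a short pipeline. The helper below extracts IPv4 addresses from failed-password lines and ranks them by frequency; the demo log line is synthetic, and the real path varies by distribution (<code>/var/log/auth.log</code> on Debian/Ubuntu, <code>/var/log/secure</code> on Amazon Linux and RHEL):</p>
<pre><code class="lang-bash"># Rank source IPs by number of failed password attempts
top_failed_ips() {
    grep 'Failed password' "$1" |
        grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' |
        sort | uniq -c | sort -rn | head
}

# Synthetic demo input; in production point this at the auth log
printf 'Failed password for root from 203.0.113.5 port 22 ssh2\n' |
    tee /tmp/demo_auth.log
top_failed_ips /tmp/demo_auth.log
</code></pre>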
<h2 id="heading-extending-the-system-future-directions">Extending The System: Future Directions</h2>
<p>From our first principles approach, several enhancements naturally emerge:</p>
<pre><code class="lang-mermaid">mindmap
  root((Server Monitoring))
    Real-time Monitoring
      Continuous data collection
      Time-series visualization
    Alerting System
      Threshold-based alerts
      Anomaly detection
    Historical Analysis
      Performance trending
      Capacity forecasting
    Network Monitoring
      Bandwidth utilization
      Connection tracking
    Service Monitoring
      API health checks
      Database performance
    Distributed Systems
      Cross-server correlation
      Service mesh analysis
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Building a server monitoring system from first principles reveals the fundamental relationships between computing resources and provides deeper insights into system behavior. Our server-stats tool, while simple, demonstrates the core concepts behind effective monitoring.</p>
<p>By understanding the why behind each metric, not just the how, we develop more intuitive insights into server performance and can make better-informed decisions about optimization, scaling, and troubleshooting.</p>
<p>The journey from raw system data to actionable intelligence begins with these fundamentals. As we build more sophisticated monitoring solutions, these principles remain the foundation upon which all effective observability is built.</p>
<h2 id="heading-about-the-author">About The Author</h2>
<p>I'm a system engineer and DevOps enthusiast passionate about understanding complex systems from first principles. This project is part of the roadmap.sh learning path for server administration and monitoring.</p>
<hr />
<p><em>For more information about server monitoring and best practices, visit <a target="_blank" href="https://roadmap.sh/projects/server-stats">roadmap.sh/projects/server-stats</a></em></p>
<p><em>For the code, visit the GitHub repo: <a target="_blank" href="https://github.com/kaalpanikh/server-stats">github.com/kaalpanikh/server-stats</a></em></p>
]]></content:encoded></item><item><title><![CDATA[Securing the Cloud: Mastering SSH Access on AWS]]></title><description><![CDATA[Introduction
In the world of server administration and cloud computing, secure remote access is of paramount importance. This blog post guides you through the entire process of setting up secure SSH (Secure Shell) access to a remote Linux server on A...]]></description><link>https://nikhilmishra.xyz/securing-the-cloud-mastering-ssh-access-on-aws</link><guid isPermaLink="true">https://nikhilmishra.xyz/securing-the-cloud-mastering-ssh-access-on-aws</guid><category><![CDATA[Remote Server Setup]]></category><category><![CDATA[AWS]]></category><category><![CDATA[ec2]]></category><category><![CDATA[ssh]]></category><category><![CDATA[Linux]]></category><category><![CDATA[Cloud Computing]]></category><category><![CDATA[#cybersecurity]]></category><category><![CDATA[cloud server security]]></category><category><![CDATA[fail2ban]]></category><category><![CDATA[SSH Key Management]]></category><category><![CDATA[public-key cryptgraphy]]></category><category><![CDATA[authentication]]></category><category><![CDATA[firewall]]></category><category><![CDATA[encryption]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Mon, 03 Mar 2025 10:53:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740998935030/45f41513-2997-45a4-a201-a4cf186e12f8.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In the world of server administration and cloud computing, secure remote access is of paramount importance. This blog post guides you through the entire process of setting up secure SSH (Secure Shell) access to a remote Linux server on AWS, using multiple SSH key pairs for authentication, and implementing additional security measures like fail2ban to protect against brute force attacks.</p>
<p>By the end of this guide, you'll have a comprehensive understanding of SSH key management, server configuration, and security best practices that you can apply to your own projects.</p>
<h2 id="heading-understanding-ssh-from-first-principles">Understanding SSH from First Principles</h2>
<h3 id="heading-what-is-ssh">What is SSH?</h3>
<p>SSH (Secure Shell) is a cryptographic network protocol that enables secure communication between two computers over an unsecured network. Unlike predecessors such as Telnet, SSH encrypts all traffic, protecting against eavesdropping and man-in-the-middle attacks.</p>
<h3 id="heading-the-cryptographic-foundation-of-ssh">The Cryptographic Foundation of SSH</h3>
<p>SSH security is built on public-key cryptography. This system uses a pair of keys:</p>
<ol>
<li><strong>Private Key</strong>: Kept secret on your local machine</li>
<li><strong>Public Key</strong>: Shared with remote servers</li>
</ol>
<p>These keys work together through asymmetric encryption:</p>
<ul>
<li>Messages encrypted with the public key can only be decrypted with the corresponding private key</li>
<li>The private key is used to generate digital signatures that can be verified with the public key</li>
</ul>
<p>This creates a secure system where:</p>
<ul>
<li>The server knows it's talking to the authorized client (authentication)</li>
<li>All communication is encrypted (confidentiality)</li>
<li>Messages cannot be altered in transit (integrity)</li>
</ul>
<pre><code class="lang-mermaid">sequenceDiagram
    Client-&gt;&gt;Server: 1. Connection Request
    Server-&gt;&gt;Client: 2. Server Identity
    Client-&gt;&gt;Server: 3. Key Exchange
    Client-&gt;&gt;Server: 4. Authentication with Private Key
    Server-&gt;&gt;Client: 5. Authentication Verification
    Client-&gt;&gt;Server: 6. Encrypted Session Begins
</code></pre>
<h2 id="heading-project-overview-remote-server-setup-with-multiple-ssh-keys">Project Overview: Remote Server Setup with Multiple SSH Keys</h2>
<p>Let's implement these concepts by setting up a remote Linux server on AWS with secure SSH access using two separate SSH key pairs. This approach demonstrates how you can manage different authentication credentials for the same server.</p>
<h3 id="heading-our-architecture">Our Architecture</h3>
<pre><code class="lang-mermaid">graph TD
    subgraph "AWS Cloud"
        EC2["Amazon Linux EC2 Instance"]
        SG["Security Group"]
        Auth["~/.ssh/authorized_keys"]
        F2B["fail2ban"]
    end

    subgraph "Local Machine"
        Key1["SSH Key Pair 1"]
        Key2["SSH Key Pair 2"] 
        Config["SSH Config File"]
        Client["SSH Client"]
    end

    SG --&gt;|"Allows Port 22"| EC2
    Key1 --&gt;|"Public Key"| Auth
    Key2 --&gt;|"Public Key"| Auth
    Auth --&gt;|"Authenticates"| EC2
    Client --&gt;|"SSH Connection"| EC2
    Config --&gt;|"Configures"| Client
    F2B --&gt;|"Protects"| EC2

    style EC2 fill:#f9f,stroke:#333,stroke-width:2px
    style SG fill:#bbf,stroke:#333,stroke-width:1px
    style Key1 fill:#bfb,stroke:#333,stroke-width:1px
    style Key2 fill:#bfb,stroke:#333,stroke-width:1px
    style F2B fill:#f66,stroke:#333,stroke-width:1px
</code></pre>
<h2 id="heading-step-1-provisioning-the-aws-ec2-instance">Step 1: Provisioning the AWS EC2 Instance</h2>
<p>The first step is to create your virtual server in the AWS cloud. EC2 (Elastic Compute Cloud) provides scalable computing capacity in the AWS cloud.</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User
    participant AWS as AWS Console
    participant EC2 as EC2 Instance

    User-&gt;&gt;AWS: Create EC2 Instance
    AWS-&gt;&gt;EC2: Launch Amazon Linux
    AWS-&gt;&gt;EC2: Configure Security Group
    AWS-&gt;&gt;User: Provide Initial Key Pair
    User-&gt;&gt;EC2: Initial SSH Connection
    Note over User,EC2: Using AWS-provided key pair
</code></pre>
<h3 id="heading-step-by-step-ec2-setup">Step-by-Step EC2 Setup:</h3>
<ol>
<li><strong>Log in to your AWS Console</strong> and navigate to the EC2 dashboard</li>
<li><p><strong>Launch a new instance</strong> with the following specifications:</p>
<ul>
<li><strong>AMI</strong>: Amazon Linux (a Linux distribution optimized for AWS)</li>
<li><strong>Instance Type</strong>: t2.micro (free tier eligible)</li>
<li><strong>Security Group</strong>: Create a new one with port 22 (SSH) open</li>
<li><strong>Key Pair</strong>: Create or select an existing key pair for initial access</li>
</ul>
</li>
<li><p><strong>Connect to your instance</strong> using the AWS-provided key:</p>
<pre><code class="lang-bash">ssh -i ~/.ssh/aws_key.pem ec2-user@your-instance-ip
</code></pre>
</li>
</ol>
<blockquote>
<p>💡 <strong>Why Amazon Linux?</strong> Amazon Linux is optimized for AWS, includes AWS tools by default, and receives regular security updates directly from Amazon. This makes it an excellent choice for AWS-hosted servers.</p>
</blockquote>
<h2 id="heading-step-2-understanding-and-generating-ssh-key-pairs">Step 2: Understanding and Generating SSH Key Pairs</h2>
<p>SSH key pairs are the cryptographic credentials that allow secure authentication without passwords. Let's generate two separate key pairs for our server:</p>
<pre><code class="lang-mermaid">graph LR
    A["ssh-keygen command"] --&gt; B["~/.ssh/my_first_key (Private)"]
    A --&gt; C["~/.ssh/my_first_key.pub (Public)"]
    A --&gt; D["~/.ssh/my_second_key (Private)"]
    A --&gt; E["~/.ssh/my_second_key.pub (Public)"]

    style B fill:#f96,stroke:#333
    style D fill:#f96,stroke:#333
    style C fill:#9f6,stroke:#333
    style E fill:#9f6,stroke:#333
</code></pre>
<h3 id="heading-generating-ssh-keys-from-first-principles">Generating SSH Keys from First Principles:</h3>
<p>The <code>ssh-keygen</code> utility creates a mathematical key pair using public key cryptography algorithms. When generating keys, consider:</p>
<ol>
<li><strong>Key Type</strong>: RSA is widely supported, but newer algorithms like Ed25519 offer better security with smaller key sizes</li>
<li><strong>Key Size</strong>: For RSA keys, 4096 bits provides strong security</li>
<li><strong>Passphrase</strong>: An optional extra layer of security that encrypts your private key</li>
</ol>
<h3 id="heading-creating-our-keys">Creating Our Keys:</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Generate first key pair</span>
ssh-keygen -t rsa -b 4096 -f ~/.ssh/my_first_key -C <span class="hljs-string">"first-key"</span>

<span class="hljs-comment"># Generate second key pair</span>
ssh-keygen -t rsa -b 4096 -f ~/.ssh/my_second_key -C <span class="hljs-string">"second-key"</span>
</code></pre>
<p>Each command creates:</p>
<ul>
<li>A private key file (<code>my_first_key</code> or <code>my_second_key</code>)</li>
<li>A public key file with <code>.pub</code> extension</li>
</ul>
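<p>Since Ed25519 was mentioned above, here is the equivalent invocation; the path is a throwaway demo location, and <code>-N ''</code> sets an empty passphrase for demonstration only, so use a real passphrase in practice:</p>
<pre><code class="lang-bash"># Ed25519: smaller keys with strong security (OpenSSH 6.5 and later)
rm -f /tmp/demo_ed25519_key /tmp/demo_ed25519_key.pub
ssh-keygen -t ed25519 -f /tmp/demo_ed25519_key -C "ed25519-key" -N '' -q
</code></pre>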
<blockquote>
<p>⚠️ <strong>Security Alert</strong>: Your private key files should NEVER be shared with anyone or committed to repositories. They should have permissions set to 600 (readable only by your user).</p>
</blockquote>
<h2 id="heading-step-3-server-side-ssh-configuration">Step 3: Server-Side SSH Configuration</h2>
<p>Now we'll configure our remote server to accept both SSH key pairs for authentication.</p>
<pre><code class="lang-mermaid">graph LR
    A["Local: Public Keys"] --&gt;|"Copy to Server"| B["Server: ~/.ssh/authorized_keys"]
    B --&gt;|"Permissions: 600"| C["SSH Server"]

    subgraph "Server Configuration"
        B
        D["~/.ssh directory&lt;br/&gt;Permissions: 700"]
    end

    style A fill:#bfb,stroke:#333
    style B fill:#bbf,stroke:#333
    style C fill:#f9f,stroke:#333
    style D fill:#bbf,stroke:#333
</code></pre>
<h3 id="heading-understanding-authorizedkeys-from-first-principles">Understanding authorized_keys from First Principles:</h3>
<p>The <code>authorized_keys</code> file contains a list of public keys that are allowed to authenticate. When an SSH client tries to connect:</p>
<ol>
<li>The server reads <code>authorized_keys</code></li>
<li>The client proves it has the corresponding private key</li>
<li>If proven, access is granted without a password</li>
</ol>
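<p>OpenSSH also ships <code>ssh-copy-id</code>, which automates appending a public key and fixing permissions (<code>ssh-copy-id -i ~/.ssh/my_first_key.pub ec2-user@your-instance-ip</code>). Done by hand against a local demo directory, the same steps look like this (the real target is <code>~/.ssh/authorized_keys</code> on the server, and the key line below is a placeholder):</p>
<pre><code class="lang-bash"># Stand-in for the server-side ~/.ssh directory
mkdir -p /tmp/demo_ssh
chmod 700 /tmp/demo_ssh

# Append a placeholder public key; one key per line
echo "ssh-rsa AAAA...placeholder first-key" | tee -a /tmp/demo_ssh/authorized_keys
chmod 600 /tmp/demo_ssh/authorized_keys
</code></pre>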
<h3 id="heading-adding-our-public-keys">Adding Our Public Keys:</h3>
<ol>
<li><p><strong>Display your public keys</strong> on your local machine:</p>
<pre><code class="lang-bash">cat ~/.ssh/my_first_key.pub
cat ~/.ssh/my_second_key.pub
</code></pre>
</li>
<li><p><strong>Add to authorized_keys</strong> on the server:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># On your remote server</span>
nano ~/.ssh/authorized_keys

<span class="hljs-comment"># Paste both public keys, each on its own line</span>
<span class="hljs-comment"># Save and exit (Ctrl+X, Y, Enter in nano)</span>
</code></pre>
</li>
<li><p><strong>Set proper permissions</strong>:</p>
<pre><code class="lang-bash">chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
</code></pre>
</li>
</ol>
<blockquote>
<p>💡 <strong>Why these permissions?</strong> The SSH daemon is highly security-conscious and will refuse to use keys if the permissions are too open. These permission settings ensure only the owner can read or modify the keys.</p>
</blockquote>
<h2 id="heading-step-4-configuring-your-local-ssh-client">Step 4: Configuring Your Local SSH Client</h2>
<p>To simplify connections, we'll create an SSH config file that specifies connection details and key locations.</p>
<pre><code class="lang-mermaid">graph LR
    A["~/.ssh/config file"] --&gt;|"Contains"| B["Host alias configuration"]
    B --&gt;|"Specifies"| C["Connection details"]
    C --&gt;|"Includes"| D["HostName (IP)"]
    C --&gt;|"Includes"| E["User"]
    C --&gt;|"Includes"| F["IdentityFile paths"]

    G["ssh roadmapsh-test-server command"] --&gt;|"Uses"| A
    G --&gt;|"Connects to"| H["Remote Server"]

    style A fill:#bbf,stroke:#333
    style G fill:#bfb,stroke:#333
    style H fill:#f9f,stroke:#333
</code></pre>
<h3 id="heading-ssh-config-from-first-principles">SSH Config from First Principles:</h3>
<p>The SSH config file allows you to:</p>
<ul>
<li>Create shortcuts for complex connection commands</li>
<li>Specify different keys for different servers</li>
<li>Set server-specific options</li>
</ul>
<h3 id="heading-creating-the-config-file">Creating the Config File:</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># On your local machine</span>
nano ~/.ssh/config
</code></pre>
<p>Add these lines:</p>
<pre><code>Host roadmapsh-test-server
    HostName your-instance-ip
    User ec2-user
    IdentityFile ~/.ssh/my_first_key
    IdentityFile ~/.ssh/my_second_key
</code></pre><p>Save and exit. Now you can connect with:</p>
<pre><code class="lang-bash">ssh roadmapsh-test-server
</code></pre>
<p>SSH will automatically try each key in order until one works.</p>
<blockquote>
<p>💡 <strong>Pro Tip</strong>: You can include additional options like <code>ServerAliveInterval 60</code> to keep connections alive or <code>Compression yes</code> to speed up connections over slow networks.</p>
</blockquote>
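<p>Putting those options together, a slightly hardened version of the same config entry might look like this (a sketch; tune the values to your network):</p>
<pre><code>Host roadmapsh-test-server
    HostName your-instance-ip
    User ec2-user
    IdentityFile ~/.ssh/my_first_key
    IdentityFile ~/.ssh/my_second_key
    ServerAliveInterval 60
    Compression yes
</code></pre>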
<h2 id="heading-step-5-enhancing-security-with-fail2ban">Step 5: Enhancing Security with fail2ban</h2>
<p>fail2ban is a security tool that monitors log files and automatically blocks IP addresses that show malicious signs, such as multiple failed login attempts.</p>
<pre><code class="lang-mermaid">flowchart TD
    A["Server Logs"] --&gt;|"Monitored by"| B["fail2ban"]
    B --&gt;|"Detects"| C["Suspicious Activity"]
    C --&gt;|"Triggers"| D["Ban Action"]
    D --&gt;|"Updates"| E["Firewall Rules"]
    E --&gt;|"Blocks"| F["Attacking IP"]

    style A fill:#bbf,stroke:#333
    style B fill:#f9f,stroke:#333
    style C fill:#f66,stroke:#333
    style F fill:#f66,stroke:#333
</code></pre>
<h3 id="heading-how-fail2ban-works-from-first-principles">How fail2ban Works from First Principles:</h3>
<p>fail2ban:</p>
<ol>
<li>Constantly monitors log files like <code>/var/log/secure</code></li>
<li>Uses regular expressions to detect patterns like failed login attempts</li>
<li>Maintains counters for each IP address</li>
<li>When thresholds are exceeded, it updates firewall rules to block the IP</li>
<li>After a configurable ban time, it removes the block</li>
</ol>
<h3 id="heading-installing-and-configuring-fail2ban">Installing and Configuring fail2ban:</h3>
<ol>
<li><p><strong>Install fail2ban</strong>:</p>
<pre><code class="lang-bash">sudo yum update -y
sudo yum install fail2ban -y
</code></pre>
</li>
<li><p><strong>Create a custom configuration</strong>:</p>
<pre><code class="lang-bash">sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
sudo nano /etc/fail2ban/jail.local
</code></pre>
</li>
<li><p><strong>Configure the SSH jail</strong> by ensuring these settings:</p>
<pre><code class="lang-ini"><span class="hljs-section">[sshd]</span>
<span class="hljs-attr">enabled</span> = <span class="hljs-literal">true</span>
<span class="hljs-attr">port</span> = ssh
<span class="hljs-attr">filter</span> = sshd
<span class="hljs-attr">logpath</span> = /var/log/secure
<span class="hljs-attr">maxretry</span> = <span class="hljs-number">3</span>
<span class="hljs-attr">bantime</span> = <span class="hljs-number">3600</span>  <span class="hljs-comment"># 1 hour in seconds</span>
</code></pre>
</li>
<li><p><strong>Start and enable fail2ban</strong>:</p>
<pre><code class="lang-bash">sudo systemctl start fail2ban
sudo systemctl <span class="hljs-built_in">enable</span> fail2ban
</code></pre>
</li>
<li><p><strong>Verify it's working</strong>:</p>
<pre><code class="lang-bash">sudo fail2ban-client status sshd
</code></pre>
</li>
</ol>
<blockquote>
<p>💡 <strong>Understanding fail2ban jails</strong>: Each "jail" is a separate configuration for a specific service (like SSH). You can have different settings for different services, allowing fine-grained control over security policies.</p>
</blockquote>
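<p>Beyond the sshd jail, fail2ban ships a <code>recidive</code> jail that watches fail2ban's own log and hands out much longer bans to repeat offenders. A sketch of enabling it in <code>jail.local</code> (the values shown are illustrative, close to the defaults):</p>
<pre><code class="lang-ini">[recidive]
enabled  = true
logpath  = /var/log/fail2ban.log
bantime  = 604800
findtime = 86400
maxretry = 5
</code></pre>
<p>Here <code>bantime</code> is one week and <code>findtime</code> one day in seconds, so an IP that keeps getting banned by other jails is locked out for a week.</p>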
<h2 id="heading-testing-our-setup">Testing Our Setup</h2>
<p>Let's verify that our setup works correctly:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User
    participant LocalSSH as Local SSH Client
    participant RemoteSSH as Remote SSH Server
    participant F2B as fail2ban

    User-&gt;&gt;LocalSSH: ssh -i ~/.ssh/my_first_key ec2-user@IP
    LocalSSH-&gt;&gt;RemoteSSH: Authenticate with first key
    RemoteSSH-&gt;&gt;User: Successful login

    User-&gt;&gt;LocalSSH: ssh -i ~/.ssh/my_second_key ec2-user@IP
    LocalSSH-&gt;&gt;RemoteSSH: Authenticate with second key
    RemoteSSH-&gt;&gt;User: Successful login

    User-&gt;&gt;LocalSSH: ssh roadmapsh-test-server
    LocalSSH-&gt;&gt;RemoteSSH: Try first key, then second key
    RemoteSSH-&gt;&gt;User: Successful login

    User-&gt;&gt;LocalSSH: Attempt with wrong key (3 times)
    LocalSSH-&gt;&gt;RemoteSSH: Failed authentication
    RemoteSSH-&gt;&gt;F2B: Log failed attempts
    F2B-&gt;&gt;RemoteSSH: Ban IP after 3 failures
</code></pre>
<h3 id="heading-test-cases">Test Cases:</h3>
<ol>
<li><p><strong>Connect with the first key</strong>:</p>
<pre><code class="lang-bash">ssh -i ~/.ssh/my_first_key ec2-user@your-instance-ip
</code></pre>
</li>
<li><p><strong>Connect with the second key</strong>:</p>
<pre><code class="lang-bash">ssh -i ~/.ssh/my_second_key ec2-user@your-instance-ip
</code></pre>
</li>
<li><p><strong>Connect with the alias</strong>:</p>
<pre><code class="lang-bash">ssh roadmapsh-test-server
</code></pre>
</li>
<li><p><strong>Test fail2ban</strong> (optional, and with caution):
On a different machine, attempt to connect with incorrect credentials multiple times. Then check if your IP is banned:</p>
<pre><code class="lang-bash">sudo fail2ban-client status sshd
</code></pre>
</li>
</ol>
<h2 id="heading-security-best-practices">Security Best Practices</h2>
<p>Throughout this tutorial, we've implemented several security best practices. Here's a summary:</p>
<h3 id="heading-key-management-best-practices">Key Management Best Practices</h3>
<ol>
<li><strong>Use strong keys</strong>: 4096-bit RSA or Ed25519 keys</li>
<li><strong>Protect private keys</strong>: Set permissions to 600 and never share them</li>
<li><strong>Use passphrases</strong>: Add an extra layer of protection to private keys</li>
<li><strong>Rotate keys periodically</strong>: Generate new keys and remove old ones</li>
</ol>
<h3 id="heading-server-configuration-best-practices">Server Configuration Best Practices</h3>
<ol>
<li><p><strong>Disable password authentication</strong> in <code>/etc/ssh/sshd_config</code>:</p>
<pre><code>PasswordAuthentication no
</code></pre></li>
<li><p><strong>Limit user access</strong> by specifying allowed users:</p>
<pre><code>AllowUsers ec2-user
</code></pre></li>
<li><p><strong>Change the default SSH port</strong> to reduce automated attacks:</p>
<pre><code>Port <span class="hljs-number">2222</span>  # Example alternative port
</code></pre></li>
<li><p><strong>Use security groups/firewalls</strong> to restrict IP ranges that can access SSH</p>
</li>
<li><p><strong>Keep your system updated</strong> with regular security patches:</p>
<pre><code class="lang-bash">sudo yum update -y
</code></pre>
</li>
</ol>
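<p>Taken together, the <code>/etc/ssh/sshd_config</code> changes above might look like the following (a sketch; restart sshd afterwards with <code>sudo systemctl restart sshd</code>, and keep an existing session open in case you lock yourself out):</p>
<pre><code>PasswordAuthentication no
AllowUsers ec2-user
Port 2222
</code></pre>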
<h2 id="heading-conclusion">Conclusion</h2>
<p>You've now set up a secure remote server access system using multiple SSH keys and enhanced protection with fail2ban. This approach provides:</p>
<ol>
<li><strong>Strong authentication</strong> without passwords</li>
<li><strong>Multiple access credentials</strong> that can be managed separately</li>
<li><strong>Protection against brute force attacks</strong></li>
<li><strong>Simplified connection</strong> through SSH config</li>
</ol>
<p>This knowledge forms the foundation of secure server administration and can be applied to any Linux-based server, not just AWS EC2 instances.</p>
<h2 id="heading-additional-resources">Additional Resources</h2>
<ul>
<li><a target="_blank" href="https://www.openssh.com/manual.html">OpenSSH Documentation</a></li>
<li><a target="_blank" href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/">AWS EC2 User Guide</a></li>
<li><a target="_blank" href="https://www.fail2ban.org/wiki/index.php/Main_Page">fail2ban Documentation</a></li>
<li><a target="_blank" href="https://www.ssh.com/academy/">SSH Academy</a></li>
<li><a target="_blank" href="https://linux-audit.com/linux-server-hardening-most-important-steps-to-secure-your-system/">Linux Security Best Practices</a></li>
</ul>
<p>GitHub repo: <a target="_blank" href="https://github.com/kaalpanikh/ssh-remote-server-setup">kaalpanikh/ssh-remote-server-setup</a></p>
<hr />
<p><em>Have you set up remote SSH access before? What other security measures do you implement on your servers? Share your experience in the comments below!</em></p>
]]></content:encoded></item><item><title><![CDATA[n8n on Azure Kubernetes Service: Benefits and Advanced Enhancements]]></title><description><![CDATA[This is Part 8 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Conclusion and Next Steps
Welcome to the final part of our n8n on AKS series! Throughout the previ...]]></description><link>https://nikhilmishra.xyz/n8n-on-azure-kubernetes-service-benefits-and-advanced-enhancements</link><guid isPermaLink="true">https://nikhilmishra.xyz/n8n-on-azure-kubernetes-service-benefits-and-advanced-enhancements</guid><category><![CDATA[n8n]]></category><category><![CDATA[Benefits]]></category><category><![CDATA[Workflow Automation]]></category><category><![CDATA[ROI (Return on Investment)]]></category><category><![CDATA[aks]]></category><category><![CDATA[advantages]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 09:24:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740907254719/ac0f11f7-ca27-4273-9129-d26df599d617.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 8 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a target="_blank" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-conclusion-and-next-steps">Conclusion and Next Steps</h1>
<p>Welcome to the final part of our n8n on AKS series! Throughout the previous seven articles, we've built a complete production-grade n8n deployment. Let's summarize what we've accomplished and explore future possibilities.</p>
<h2 id="heading-project-summary">Project Summary</h2>
<p>In this blog post, we've walked through the complete process of deploying a production-ready n8n workflow automation platform on Azure Kubernetes Service (AKS). Let's recap what we've accomplished:</p>
<ol>
<li><p><strong>Established First Principles</strong>: We started by understanding the fundamental requirements of a production workflow system: data persistence, execution reliability, security, scalability, and maintainability.</p>
</li>
<li><p><strong>Designed a Robust Architecture</strong>: Using these principles, we designed a comprehensive architecture with distinct layers:</p>
<ul>
<li>Data Layer (PostgreSQL and Redis)</li>
<li>Application Layer (n8n main and workers)</li>
<li>External Access Layer (Ingress and SSL/TLS)</li>
</ul>
</li>
<li><p><strong>Implemented Core Components</strong>:</p>
<ul>
<li>AKS cluster with proper resource allocation</li>
<li>PostgreSQL database with persistence and proper user access</li>
<li>Redis queue for reliable workflow distribution</li>
<li>n8n main instance for UI and API access</li>
<li>Worker nodes for distributed workflow execution</li>
</ul>
</li>
<li><p><strong>Secured the Deployment</strong>:</p>
<ul>
<li>Kubernetes secrets for sensitive credentials</li>
<li>SSL/TLS encryption with automatic certificate management</li>
<li>Proper service isolation and network security</li>
</ul>
</li>
<li><p><strong>Added Production Features</strong>:</p>
<ul>
<li>Horizontal scaling for worker nodes</li>
<li>Monitoring and alerting setup</li>
<li>Backup and disaster recovery procedures</li>
<li>Maintenance and update strategies</li>
</ul>
</li>
<li><p><strong>Provided Troubleshooting Guidance</strong>:</p>
<ul>
<li>Common issues and resolution approaches</li>
<li>Diagnostic procedures for each component</li>
<li>Tools and scripts for efficient problem-solving</li>
</ul>
</li>
</ol>
<h2 id="heading-architecture-benefits">Architecture Benefits</h2>
<p>Our implementation provides several key benefits:</p>
<h3 id="heading-scalability">Scalability</h3>
<ul>
<li><strong>Horizontal Scaling</strong>: Worker nodes automatically scale based on demand</li>
<li><strong>Resource Efficiency</strong>: Components scaled according to their specific needs</li>
<li><strong>Growth Potential</strong>: Architecture can handle increasing workflow complexity and volume</li>
</ul>
<h3 id="heading-reliability">Reliability</h3>
<ul>
<li><strong>High Availability</strong>: Multiple nodes prevent single points of failure</li>
<li><strong>Resilient Execution</strong>: Queue-based processing ensures workflows run reliably</li>
<li><strong>Data Durability</strong>: Persistent storage with backup strategies</li>
</ul>
<h3 id="heading-security">Security</h3>
<ul>
<li><strong>Encrypted Communication</strong>: SSL/TLS for all external traffic</li>
<li><strong>Secure Credentials</strong>: Kubernetes secrets for sensitive data</li>
<li><strong>Isolation</strong>: Proper network controls and service separation</li>
</ul>
<h3 id="heading-maintainability">Maintainability</h3>
<ul>
<li><strong>Kubernetes Native</strong>: Leveraging Kubernetes features for updates and rollbacks</li>
<li><strong>Monitoring Integration</strong>: Comprehensive visibility into system health</li>
<li><strong>Documented Procedures</strong>: Clear processes for common maintenance tasks</li>
</ul>
<h2 id="heading-business-value">Business Value</h2>
<p>This n8n deployment delivers significant business value:</p>
<ol>
<li><strong>Automation Capabilities</strong>: Enables complex workflow automation across various business systems</li>
<li><strong>Reduced Manual Work</strong>: Eliminates repetitive tasks through reliable automation</li>
<li><strong>Integration Hub</strong>: Connects disparate systems without custom development</li>
<li><strong>Data Control</strong>: Self-hosted solution keeps sensitive data within your control</li>
<li><strong>Cost Efficiency</strong>: Right-sized infrastructure with optimization strategies</li>
<li><strong>Scalable Foundation</strong>: Grows with your automation needs</li>
</ol>
<h2 id="heading-key-metrics-and-performance">Key Metrics and Performance</h2>
<p>Our n8n deployment achieves impressive performance metrics:</p>
<ul>
<li><strong>Worker Scalability</strong>: 1-5 worker nodes based on demand</li>
<li><strong>Concurrent Workflows</strong>: Support for 50+ concurrent workflow executions</li>
<li><strong>Database Performance</strong>: Optimized PostgreSQL capable of handling 1000+ workflow definitions</li>
<li><strong>API Responsiveness</strong>: Sub-100ms response times for UI and API operations</li>
<li><strong>High Availability</strong>: 99.9% uptime through redundant components</li>
</ul>
<h2 id="heading-cost-analysis">Cost Analysis</h2>
<p>The deployed solution maintains a balance between performance and cost:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Monthly Cost</td></tr>
</thead>
<tbody>
<tr>
<td>AKS Nodes (2 × D2s v3)</td><td>$140.16</td></tr>
<tr>
<td>Storage (Premium SSD, 64GB)</td><td>$10.44</td></tr>
<tr>
<td>Networking (Load Balancer)</td><td>$23.00</td></tr>
<tr>
<td>Monitoring</td><td>$7.50</td></tr>
<tr>
<td>Backups</td><td>$5.20</td></tr>
<tr>
<td><strong>Total</strong></td><td><strong>$186.30</strong></td></tr>
</tbody>
</table>
</div><p>These costs could be further optimized for development or testing environments.</p>
<h2 id="heading-next-steps-and-further-improvements">Next Steps and Further Improvements</h2>
<p>While our implementation is production-ready, several enhancements could be considered:</p>
<h3 id="heading-1-advanced-security-features">1. Advanced Security Features</h3>
<ul>
<li><strong>Azure AD Integration</strong>: Add Azure Active Directory integration for n8n authentication</li>
<li><strong>Private Endpoints</strong>: Configure private endpoints for Azure resources</li>
<li><strong>Network Policies</strong>: Implement Kubernetes network policies for granular traffic control</li>
<li><strong>Secret Rotation</strong>: Set up automated rotation of database and encryption credentials</li>
</ul>
<h3 id="heading-2-enhanced-scalability">2. Enhanced Scalability</h3>
<ul>
<li><strong>Global Distribution</strong>: Deploy across multiple regions for geographic redundancy</li>
<li><strong>Read Replicas</strong>: Add PostgreSQL read replicas for query-heavy workflows</li>
<li><strong>Specialized Node Pools</strong>: Create dedicated node pools for specific workflow types</li>
</ul>
<h3 id="heading-3-operational-improvements">3. Operational Improvements</h3>
<ul>
<li><strong>Automated Testing</strong>: Implement CI/CD pipelines for n8n workflows</li>
<li><strong>Custom Metrics</strong>: Develop workflow-specific metrics and dashboards</li>
<li><strong>Cost Optimization</strong>: Further refine resource allocation based on usage patterns</li>
<li><strong>Chaos Testing</strong>: Conduct chaos engineering exercises to improve resilience</li>
</ul>
<h3 id="heading-4-integration-enhancements">4. Integration Enhancements</h3>
<ul>
<li><strong>Managed Identity</strong>: Use Azure Managed Identity for secure Azure service connections</li>
<li><strong>API Management</strong>: Add Azure API Management for better API governance</li>
<li><strong>Logic Apps Bridge</strong>: Create bridges between n8n and Azure Logic Apps for hybrid workflows</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Congratulations! You've completed this comprehensive journey to deploying n8n on Azure Kubernetes Service. You now have a production-ready workflow automation platform that is scalable, reliable, secure, and maintainable.</p>
<p>I hope this series has provided valuable insights not just into n8n and AKS specifically, but also into the broader principles of designing and implementing production systems on Kubernetes. The approach we've taken—starting from first principles and building up a complete solution—can be applied to many other applications and scenarios.</p>
<p>Thank you for following along! If you have questions or want to share your own experiences with n8n or Kubernetes deployments, please leave a comment below.</p>
<p><em>Did you find this series helpful? Consider sharing it with colleagues who might benefit from this knowledge.</em></p>
<hr />
<p>What workflow automation use cases are you implementing or planning to implement with n8n? What other tools would you like to see deployed on Kubernetes using this approach? Share your thoughts in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Troubleshooting n8n on Kubernetes: Problems and Solutions Guide]]></title><description><![CDATA[This is Part 7 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Troubleshooting and Problem Resolution
Welcome to Part 7 of our n8n on AKS series! In Part 6, we i...]]></description><link>https://nikhilmishra.xyz/troubleshooting-n8n-on-kubernetes-problems-and-solutions-guide</link><guid isPermaLink="true">https://nikhilmishra.xyz/troubleshooting-n8n-on-kubernetes-problems-and-solutions-guide</guid><category><![CDATA[troubleshooting]]></category><category><![CDATA[n8n]]></category><category><![CDATA[kubernetes debugging]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[k8s]]></category><category><![CDATA[workflow]]></category><category><![CDATA[Workflow Automation]]></category><category><![CDATA[Issues]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 09:17:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740906960339/7ccd6e0d-90fc-469f-8224-98a9c56674d3.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 7 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a target="_blank" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-troubleshooting-and-problem-resolution">Troubleshooting and Problem Resolution</h1>
<p>Welcome to Part 7 of our n8n on AKS series! In <a class="post-section-overview" href="#part6-link">Part 6</a>, we implemented monitoring and optimization strategies. Even with the best preparation, issues can arise, so today we'll explore comprehensive troubleshooting techniques.</p>
<p>Even the most carefully designed systems encounter issues. This guide walks through the most common problems you might hit with your n8n deployment on AKS and how to resolve them.</p>
<h2 id="heading-common-issues-and-resolutions">Common Issues and Resolutions</h2>
<h3 id="heading-database-connection-issues">Database Connection Issues</h3>
<p><strong>Symptoms:</strong></p>
<ul>
<li>n8n pods showing errors like <code>Error: connect ETIMEDOUT</code> or <code>Error: connect ECONNREFUSED</code></li>
<li>Database-related error messages in n8n logs</li>
<li>n8n UI showing database connection errors</li>
</ul>
<p><strong>Diagnostic Approach:</strong></p>
<ol>
<li><p>Check PostgreSQL pod status:</p>
<pre><code class="lang-bash">kubectl get pods -n n8n -l app=postgres
</code></pre>
</li>
<li><p>Verify PostgreSQL service:</p>
<pre><code class="lang-bash">kubectl get svc postgres-service -n n8n
</code></pre>
</li>
<li><p>Check database logs:</p>
<pre><code class="lang-bash">kubectl logs $(kubectl get pod -l app=postgres -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n
</code></pre>
</li>
<li><p>Test database connection from n8n pod:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  node -e <span class="hljs-string">"const { Pool } = require('pg'); const pool = new Pool({host: 'postgres-service', user: process.env.DB_POSTGRESDB_USER, password: process.env.DB_POSTGRESDB_PASSWORD, database: 'n8n'}); pool.query('SELECT NOW()', (err, res) =&gt; { console.log(err || res.rows[0]); pool.end(); })"</span>
</code></pre>
</li>
</ol>
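<p>The diagnostics above (and those in the following sections) all repeat the same <code>jsonpath</code> lookup to resolve a pod name. A small shell helper keeps the commands readable; this is a sketch that assumes <code>kubectl</code> is on your PATH and your pods carry the <code>app</code> labels used in this series:</p>
<pre><code class="lang-bash"># Resolve the name of the first pod in the n8n namespace matching an app label
npod() {
  kubectl get pod -l app="$1" -n n8n -o jsonpath='{.items[0].metadata.name}'
}

# Usage examples:
#   kubectl logs "$(npod postgres)" -n n8n
#   kubectl exec -it "$(npod n8n)" -n n8n -- sh
</code></pre>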
<p><strong>Common Resolutions:</strong></p>
<ol>
<li><p><strong>Authentication Issues</strong>:</p>
<ul>
<li>Verify the PostgreSQL credentials in the Kubernetes secrets</li>
<li>Ensure the n8n database user exists and has proper permissions</li>
</ul>
</li>
<li><p><strong>Network Issues</strong>:</p>
<ul>
<li>Check if pods are in the same namespace</li>
<li>Verify that the service name resolution works</li>
<li>Ensure no network policies are blocking the connection</li>
</ul>
</li>
<li><p><strong>Database Health Issues</strong>:</p>
<ul>
<li>Check for PostgreSQL resource constraints</li>
<li>Verify the database isn't in recovery mode</li>
<li>Check for disk space issues</li>
</ul>
</li>
</ol>
<h3 id="heading-queueredis-connection-issues">Queue/Redis Connection Issues</h3>
<p><strong>Symptoms:</strong></p>
<ul>
<li>Workflows are triggered but stay in "waiting" status</li>
<li>Error messages like <code>Error connecting to Redis</code> in n8n logs</li>
<li>Workers not processing queued workflows</li>
</ul>
<p><strong>Diagnostic Approach:</strong></p>
<ol>
<li><p>Check Redis pod status:</p>
<pre><code class="lang-bash">kubectl get pods -n n8n -l app=redis
</code></pre>
</li>
<li><p>Verify Redis service:</p>
<pre><code class="lang-bash">kubectl get svc redis-service -n n8n
</code></pre>
</li>
<li><p>Check Redis logs:</p>
<pre><code class="lang-bash">kubectl logs $(kubectl get pod -l app=redis -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n
</code></pre>
</li>
<li><p>Test Redis connection from n8n pod:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  node -e <span class="hljs-string">"const Redis = require('ioredis'); const redis = new Redis('redis-service'); redis.ping().then(res =&gt; { console.log(res); redis.disconnect(); })"</span>
</code></pre>
</li>
</ol>
<p><strong>Common Resolutions:</strong></p>
<ol>
<li><p><strong>Connection Configuration</strong>:</p>
<ul>
<li>Verify Redis host and port settings in n8n environment variables</li>
<li>Check if the Redis service name is correctly specified</li>
</ul>
</li>
<li><p><strong>Queue Stuck Issues</strong>:</p>
<ul>
<li>Clear stuck queues with Redis CLI commands</li>
<li>Restart the Redis pod if necessary</li>
</ul>
</li>
<li><p><strong>Worker Configuration</strong>:</p>
<ul>
<li>Ensure workers are configured for queue mode</li>
<li>Verify workers have the same encryption key as the main n8n instance</li>
</ul>
</li>
</ol>
<h3 id="heading-ssltls-certificate-issues">SSL/TLS Certificate Issues</h3>
<p><strong>Symptoms:</strong></p>
<ul>
<li>Browser shows "Your connection is not private" warning</li>
<li>Certificate errors in browser console</li>
<li>Ingress controller logs showing certificate issues</li>
</ul>
<p><strong>Diagnostic Approach:</strong></p>
<ol>
<li><p>Check certificate status:</p>
<pre><code class="lang-bash">kubectl get certificate -n n8n
</code></pre>
</li>
<li><p>Examine certificate details:</p>
<pre><code class="lang-bash">kubectl describe certificate n8n-tls-secret -n n8n
</code></pre>
</li>
<li><p>Check cert-manager logs:</p>
<pre><code class="lang-bash">kubectl logs -n cert-manager -l app=cert-manager
</code></pre>
</li>
<li><p>Verify the ClusterIssuer status:</p>
<pre><code class="lang-bash">kubectl describe clusterissuer letsencrypt-prod
</code></pre>
</li>
</ol>
<p><strong>Common Resolutions:</strong></p>
<ol>
<li><p><strong>Domain Validation Issues</strong>:</p>
<ul>
<li>Ensure DNS is correctly configured to point to your ingress controller IP</li>
<li>Verify that the HTTP-01 challenge can reach your ingress controller</li>
<li>Check if Let's Encrypt rate limits have been hit (for example, at most 5 duplicate certificates per week)</li>
</ul>
</li>
<li><p><strong>Configuration Issues</strong>:</p>
<ul>
<li>Verify email address in ClusterIssuer is valid</li>
<li>Ensure ingress class is correctly specified</li>
<li>Check TLS section in Ingress resource matches your domain</li>
</ul>
</li>
<li><p><strong>Certificate Renewal Issues</strong>:</p>
<ul>
<li>Manually trigger certificate renewal if needed</li>
<li>Check if cert-manager CRDs are up to date</li>
<li>Verify cert-manager has necessary permissions</li>
</ul>
</li>
</ol>
<h3 id="heading-ingress-and-external-access-issues">Ingress and External Access Issues</h3>
<p><strong>Symptoms:</strong></p>
<ul>
<li>Unable to access n8n UI from the internet</li>
<li>404, 502, or other HTTP errors when accessing your domain</li>
<li>Timeouts when attempting to connect</li>
</ul>
<p><strong>Diagnostic Approach:</strong></p>
<ol>
<li><p>Check Ingress resource status:</p>
<pre><code class="lang-bash">kubectl get ingress -n n8n
</code></pre>
</li>
<li><p>Verify Ingress controller pods:</p>
<pre><code class="lang-bash">kubectl get pods -n default -l app.kubernetes.io/name=ingress-nginx
</code></pre>
</li>
<li><p>Check Ingress controller logs:</p>
<pre><code class="lang-bash">kubectl logs -n default -l app.kubernetes.io/name=ingress-nginx
</code></pre>
</li>
<li><p>Test connectivity to Ingress IP:</p>
<pre><code class="lang-bash">curl -v http://&lt;ingress-ip&gt;
</code></pre>
</li>
</ol>
<p><strong>Common Resolutions:</strong></p>
<ol>
<li><p><strong>DNS Issues</strong>:</p>
<ul>
<li>Verify DNS A record points to the correct Ingress Controller IP</li>
<li>Check if DNS propagation is complete (may take up to 48 hours)</li>
<li>Use <code>nslookup</code> or <code>dig</code> to verify DNS resolution</li>
</ul>
</li>
<li><p><strong>Ingress Configuration</strong>:</p>
<ul>
<li>Ensure the Ingress resource specifies the correct service and port</li>
<li>Verify host rules match your domain exactly</li>
<li>Check path settings and ensure they match n8n requirements</li>
</ul>
</li>
<li><p><strong>Network Issues</strong>:</p>
<ul>
<li>Verify Azure Network Security Groups allow traffic on ports 80 and 443</li>
<li>Check if any firewalls are blocking access to your AKS cluster</li>
<li>Ensure the Ingress Controller service is of type LoadBalancer with an external IP</li>
</ul>
</li>
</ol>
<h2 id="heading-troubleshooting-decision-tree">Troubleshooting Decision Tree</h2>
<p>The following diagram presents a structured approach to troubleshooting n8n deployment issues:</p>
<pre><code class="lang-mermaid">flowchart TD
    start[Issue Detected] --&gt; issue{What type of issue?}

    issue --&gt;|UI/Access| ui[UI or Access Issue]
    issue --&gt;|Workflow Execution| exec[Workflow Execution Issue]
    issue --&gt;|Infrastructure| infra[Infrastructure Issue]

    ui --&gt; uiDiag{UI Diagnostic}
    uiDiag --&gt;|Cannot reach site| dns[Check DNS &amp; Ingress]
    uiDiag --&gt;|SSL Error| cert[Check Certificate]
    uiDiag --&gt;|Login Issue| auth[Check Authentication]
    uiDiag --&gt;|UI Loads but Errors| n8nui[Check n8n Logs]

    dns --&gt; dnsFix[DNS &amp; Ingress Fixes]
    cert --&gt; certFix[Certificate Fixes]
    auth --&gt; authFix[Auth Fixes]
    n8nui --&gt; uiFix[UI Issue Fixes]

    exec --&gt; execDiag{Execution Diagnostic}
    execDiag --&gt;|Workflow Stuck in Queue| queue[Check Redis &amp; Workers]
    execDiag --&gt;|Database Errors| db[Check PostgreSQL]
    execDiag --&gt;|Execution Failures| node[Check Node Errors]

    queue --&gt; queueFix[Queue &amp; Worker Fixes]
    db --&gt; dbFix[Database Fixes]
    node --&gt; nodeFix[Node-specific Fixes]

    infra --&gt; infraDiag{Infrastructure Diagnostic}
    infraDiag --&gt;|Pod Issues| pod[Check Pod Status]
    infraDiag --&gt;|Resource Issues| res[Check Resource Usage]
    infraDiag --&gt;|Network Issues| net[Check Network Policy]

    pod --&gt; podFix[Pod Fixes]
    res --&gt; resFix[Resource Fixes]
    net --&gt; netFix[Network Fixes]

    style start fill:#f96,stroke:#333
    classDef fix fill:#9f6,stroke:#333
    class dnsFix,certFix,authFix,uiFix,queueFix,dbFix,nodeFix,podFix,resFix,netFix fix
</code></pre>
<h2 id="heading-advanced-diagnostic-workflows">Advanced Diagnostic Workflows</h2>
<h3 id="heading-database-performance-issues">Database Performance Issues</h3>
<p>If workflow execution is slow or database operations are taking too long:</p>
<ol>
<li><p>Check database load:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=postgres -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  psql -U postgres -c <span class="hljs-string">"SELECT * FROM pg_stat_activity WHERE state = 'active';"</span>
</code></pre>
</li>
<li><p>Identify long-running queries:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=postgres -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  psql -U postgres -c <span class="hljs-string">"SELECT pid, now() - query_start AS duration, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC;"</span>
</code></pre>
</li>
<li><p>Check table sizes (for index usage, query <code>pg_stat_user_indexes</code> in the same way):</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=postgres -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  psql -U postgres -d n8n -c <span class="hljs-string">"SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) AS total_size FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;"</span>
</code></pre>
</li>
</ol>
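<p>If step 2 surfaces a runaway query, it can be cancelled by its <code>pid</code> (a sketch — <code>12345</code> stands in for a pid from the previous output; <code>pg_cancel_backend</code> interrupts just the query, while <code>pg_terminate_backend</code> drops the whole connection):</p>
<pre><code class="lang-bash">kubectl exec -it $(kubectl get pod -l app=postgres -n n8n -o jsonpath='{.items[0].metadata.name}') -n n8n -- \
  psql -U postgres -c "SELECT pg_cancel_backend(12345);"
</code></pre>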
<h3 id="heading-memory-leak-investigation">Memory Leak Investigation</h3>
<p>If n8n pods are steadily increasing in memory usage:</p>
<ol>
<li><p>Get memory usage metrics:</p>
<pre><code class="lang-bash">kubectl top pods -n n8n
</code></pre>
</li>
<li><p>Check container memory limit and usage:</p>
<pre><code class="lang-bash">kubectl describe pod $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n
</code></pre>
</li>
<li><p>Generate a heap dump (for advanced debugging). Note that <code>node -e</code> run via <code>kubectl exec</code> snapshots the helper process it spawns, not the running n8n process; to profile n8n itself, enable its inspector (for example by sending the process <code>SIGUSR1</code> on Linux) and attach Chrome DevTools:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  node --expose-gc -e <span class="hljs-string">"const fs=require('fs'); setTimeout(() =&gt; { global.gc(); const heapSnapshot = require('v8').getHeapSnapshot(); const file = fs.createWriteStream('/tmp/heap.json'); heapSnapshot.pipe(file); }, 1000);"</span>

kubectl cp n8n/$(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>):/tmp/heap.json ./heap.json
</code></pre>
</li>
<li><p>Analyze the heap dump with Chrome DevTools or a memory analyzer.</p>
</li>
</ol>
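<p>When tracking growth over time (steps 1 and 2), it helps to flag only the pods above a memory threshold. A small helper for that (hypothetical, assuming input in the <code>kubectl top pods --no-headers</code> column format <code>NAME CPU MEMORY</code> with Mi units; the sample lines are fabricated):</p>
<pre><code class="lang-bash"># Print pods whose memory column exceeds a threshold in Mi (assumes Mi units)
high_mem() {
  local limit_mi="$1"
  awk -v lim="$limit_mi" '{gsub(/Mi/,"",$3); if ($3+0 > lim) print $1, $3 "Mi"}'
}

printf 'n8n-abc 250m 412Mi\nworker-xyz 900m 1300Mi\n' | high_mem 1024
# -> worker-xyz 1300Mi
</code></pre>
<p>In the cluster you would pipe real data through it, e.g. <code>kubectl top pods -n n8n --no-headers | high_mem 1024</code>; recording this periodically makes steady growth easy to spot.</p>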
<h3 id="heading-network-connectivity-issues">Network Connectivity Issues</h3>
<p>If services can't communicate with each other:</p>
<ol>
<li><p>Test network connectivity between pods:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  nc -zv postgres-service 5432

kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  nc -zv redis-service 6379
</code></pre>
</li>
<li><p>Check if network policies are restricting traffic:</p>
<pre><code class="lang-bash">kubectl get networkpolicies -n n8n
</code></pre>
</li>
<li><p>Verify DNS resolution:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  nslookup postgres-service

kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  nslookup redis-service
</code></pre>
</li>
</ol>
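<p>The per-service checks in step 1 can be folded into a loop over every dependency (a sketch; the service names follow this series' manifests, and <code>nc</code> being available in the n8n image is an assumption):</p>
<pre><code class="lang-bash"># Test TCP connectivity from the n8n pod to each dependency
POD=$(kubectl get pod -l app=n8n -n n8n -o jsonpath='{.items[0].metadata.name}')
for target in postgres-service:5432 redis-service:6379; do
  host="${target%%:*}"; port="${target##*:}"
  kubectl exec -it "$POD" -n n8n -- nc -zv "$host" "$port"
done
</code></pre>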
<h2 id="heading-creating-a-diagnostic-information-bundle">Creating a Diagnostic Information Bundle</h2>
<p>For complex issues, it's often helpful to collect comprehensive diagnostic information:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
<span class="hljs-comment"># collect-diagnostics.sh - Collect diagnostic information for n8n deployment</span>

<span class="hljs-comment"># Create output directory</span>
mkdir -p n8n-diagnostics
<span class="hljs-built_in">cd</span> n8n-diagnostics

<span class="hljs-comment"># Collect pod information</span>
kubectl get pods -n n8n -o yaml &gt; pods.yaml
kubectl describe pods -n n8n &gt; pods-describe.txt

<span class="hljs-comment"># Collect logs</span>
<span class="hljs-keyword">for</span> pod <span class="hljs-keyword">in</span> $(kubectl get pods -n n8n -o jsonpath=<span class="hljs-string">'{.items[*].metadata.name}'</span>); <span class="hljs-keyword">do</span>
  kubectl logs <span class="hljs-variable">$pod</span> -n n8n &gt; <span class="hljs-variable">$pod</span>-logs.txt
<span class="hljs-keyword">done</span>

<span class="hljs-comment"># Collect service and endpoint information</span>
kubectl get svc,endpoints -n n8n -o yaml &gt; services.yaml

<span class="hljs-comment"># Collect ingress and certificate information</span>
kubectl get ingress,certificate -n n8n -o yaml &gt; ingress-cert.yaml
kubectl describe ingress,certificate -n n8n &gt; ingress-cert-describe.txt

<span class="hljs-comment"># Collect events</span>
kubectl get events -n n8n &gt; events.txt

<span class="hljs-comment"># Collect resource usage</span>
kubectl top pods -n n8n &gt; pod-resources.txt
kubectl top nodes &gt; node-resources.txt

<span class="hljs-comment"># Create a tar archive</span>
tar -czf n8n-diagnostics.tar.gz *

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Diagnostic information collected in n8n-diagnostics.tar.gz"</span>
</code></pre>
<h2 id="heading-troubleshooting-cheatsheet">Troubleshooting Cheatsheet</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Issue</td><td>Check Command</td><td>Resolution Strategy</td></tr>
</thead>
<tbody>
<tr>
<td>Pod won't start</td><td><code>kubectl describe pod &lt;pod-name&gt; -n n8n</code></td><td>Check events section for errors</td></tr>
<tr>
<td>Pod crashing</td><td><code>kubectl logs &lt;pod-name&gt; -n n8n</code></td><td>Look for error messages near the end</td></tr>
<tr>
<td>Service unavailable</td><td><code>kubectl get endpoints &lt;service-name&gt; -n n8n</code></td><td>Verify endpoints exist</td></tr>
<tr>
<td>Certificate issues</td><td><code>kubectl describe certificate &lt;cert-name&gt; -n n8n</code></td><td>Check events and conditions</td></tr>
<tr>
<td>Database connection</td><td><code>kubectl exec -it &lt;pod-name&gt; -n n8n -- env | grep DB_</code></td><td>Verify environment variables</td></tr>
<tr>
<td>Redis connection</td><td><code>kubectl exec -it &lt;pod-name&gt; -n n8n -- env | grep REDIS</code></td><td>Verify environment variables</td></tr>
<tr>
<td>Ingress not working</td><td><code>kubectl get ingress &lt;ingress-name&gt; -n n8n</code></td><td>Check ADDRESS field has an IP</td></tr>
<tr>
<td>Resource constraints</td><td><code>kubectl top pods -n n8n</code></td><td>Check for pods near resource limits</td></tr>
<tr>
<td>Webhook not triggering</td><td><code>kubectl logs &lt;pod-name&gt; -n n8n | grep webhook</code></td><td>Verify webhook URL and connectivity</td></tr>
</tbody>
</table>
</div><h2 id="heading-summary">Summary</h2>
<p>Troubleshooting a production n8n deployment on AKS requires a systematic approach built on an understanding of:</p>
<ol>
<li><strong>Common failure modes</strong> and their symptoms</li>
<li><strong>Diagnostic approaches</strong> for each component</li>
<li><strong>Resolution strategies</strong> for different issues</li>
</ol>
<p>With these in place, you can quickly identify and resolve problems, minimizing downtime and ensuring a reliable workflow automation platform.</p>
<p>Remember that many issues can be prevented through proper monitoring and proactive maintenance, as discussed in the previous section. When problems do occur, having these troubleshooting procedures documented will significantly reduce the mean time to resolution.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Armed with these troubleshooting strategies and diagnostic workflows, you can quickly identify and resolve issues in your n8n deployment. Remember that systematic investigation and a good understanding of the architecture are key to efficient problem resolution.</p>
<p>In our final article, we'll summarize what we've accomplished, review the benefits of our architecture, and explore advanced enhancements for the future. [Continue to Part 8: Conclusion and Next Steps]</p>
<hr />
<p>What troubleshooting techniques have you found most effective for Kubernetes applications? Have you encountered any particularly challenging issues with workflow automation systems? Share your stories in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Monitoring and Optimizing n8n on Kubernetes: The Complete Guide]]></title><description><![CDATA[This is Part 6 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Monitoring, Maintenance, and Optimization
A production-grade deployment requires robust monitoring...]]></description><link>https://nikhilmishra.xyz/monitoring-and-optimizing-n8n-on-kubernetes-the-complete-guide</link><guid isPermaLink="true">https://nikhilmishra.xyz/monitoring-and-optimizing-n8n-on-kubernetes-the-complete-guide</guid><category><![CDATA[n8n]]></category><category><![CDATA[Performance Monitoring]]></category><category><![CDATA[performance]]></category><category><![CDATA[monitoring]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[k8s]]></category><category><![CDATA[optimization]]></category><category><![CDATA[Workflow Automation]]></category><category><![CDATA[#kubernetes-maintainance]]></category><category><![CDATA[OptimizationStrategies]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 09:11:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740906571781/04cdf6f2-a04c-4236-863d-0d3a87457aca.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 6 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a target="_blank" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-monitoring-maintenance-and-optimization">Monitoring, Maintenance, and Optimization</h1>
<p>A production-grade deployment requires robust monitoring, routine maintenance procedures, and performance optimization. In this section, we'll cover:</p>
<ol>
<li>Monitoring strategies for n8n on AKS</li>
<li>Maintenance procedures and best practices</li>
<li>Performance optimization techniques</li>
<li>Cost optimization approaches</li>
</ol>
<h2 id="heading-monitoring-your-n8n-deployment">Monitoring Your n8n Deployment</h2>
<h3 id="heading-key-metrics-to-monitor">Key Metrics to Monitor</h3>
<p>For an n8n deployment, several metrics are critical to track:</p>
<ol>
<li><p><strong>Application Health</strong>:</p>
<ul>
<li>Pod readiness and liveness</li>
<li>API response times</li>
<li>Error rates in logs</li>
<li>Webhook reliability</li>
</ul>
</li>
<li><p><strong>Infrastructure Metrics</strong>:</p>
<ul>
<li>CPU and memory usage across all components</li>
<li>Storage usage and growth rate</li>
<li>Network traffic patterns</li>
<li>Queue length and processing times</li>
</ul>
</li>
<li><p><strong>Database Performance</strong>:</p>
<ul>
<li>Query execution times</li>
<li>Connection pool utilization</li>
<li>Database size growth</li>
<li>Transaction rates</li>
</ul>
</li>
</ol>
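<p>Several of these metrics come straight from Azure Monitor, but the log error rate is easy to approximate with a small pipeline helper (a sketch — in practice you would feed it <code>kubectl logs -l app=n8n -n n8n --since=1h</code>; the input below is a fabricated sample):</p>
<pre><code class="lang-bash"># Percentage of log lines containing ERROR
error_rate() {
  awk 'BEGIN{e=0;t=0} {t++} /ERROR/{e++} END{if (t > 0) printf "%.1f%%\n", 100*e/t; else print "0.0%"}'
}

printf 'INFO ok\nERROR boom\nINFO ok\nINFO ok\n' | error_rate
# -> 25.0%
</code></pre>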
<h3 id="heading-implementing-azure-monitor">Implementing Azure Monitor</h3>
<p>Azure Monitor provides comprehensive monitoring for AKS clusters. We implemented it with:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Enable Azure Monitor for container insights</span>
az aks enable-addons -a monitoring -n n8n-cluster -g n8n-aks-rg
</code></pre>
<p>This enables:</p>
<ul>
<li>Container metrics collection</li>
<li>Log aggregation</li>
<li>Performance dashboards</li>
<li>Alert configuration</li>
</ul>
<h3 id="heading-creating-custom-dashboards">Creating Custom Dashboards</h3>
<p>We created custom dashboards in Azure portal for n8n-specific metrics:</p>
<ol>
<li><p><strong>n8n Operations Dashboard</strong>:</p>
<ul>
<li>Workflow execution rates</li>
<li>Error percentages</li>
<li>API request volumes</li>
<li>Active user sessions</li>
</ul>
</li>
<li><p><strong>Infrastructure Health Dashboard</strong>:</p>
<ul>
<li>Pod status across namespaces</li>
<li>Node resource utilization</li>
<li>Storage consumption</li>
<li>Networking metrics</li>
</ul>
</li>
</ol>
<h3 id="heading-setting-up-alerts">Setting Up Alerts</h3>
<p>Critical alerts were configured for:</p>
<ol>
<li><p><strong>High Severity</strong>:</p>
<ul>
<li>Any pod in Failed or CrashLoopBackOff state</li>
<li>Database or Redis unavailability</li>
<li>Worker queue backlog exceeding thresholds</li>
<li>Certificate expiration warnings</li>
</ul>
</li>
<li><p><strong>Medium Severity</strong>:</p>
<ul>
<li>CPU or memory usage above 80% for over 15 minutes</li>
<li>Persistent storage approaching capacity</li>
<li>High error rates in application logs</li>
<li>Unusual traffic patterns (potential security issues)</li>
</ul>
</li>
</ol>
<h3 id="heading-log-management">Log Management</h3>
<p>For comprehensive log management, we configured:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">fluentd-config</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">kube-system</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">fluent.conf:</span> <span class="hljs-string">|
    # Log collection and forwarding configuration
    # Details omitted for brevity</span>
</code></pre>
<p>This configuration:</p>
<ul>
<li>Collects container logs across the cluster</li>
<li>Enriches logs with metadata (namespace, pod name, etc.)</li>
<li>Forwards logs to Azure Log Analytics</li>
<li>Enables structured querying and analytics</li>
</ul>
<h2 id="heading-maintenance-procedures">Maintenance Procedures</h2>
<h3 id="heading-backup-and-disaster-recovery">Backup and Disaster Recovery</h3>
<p>We implemented a comprehensive backup strategy:</p>
<ol>
<li><strong>Database Backups</strong>:<ul>
<li>Daily full backups retained for 30 days</li>
<li>Point-in-time recovery capability</li>
<li>Geo-redundant storage for backups</li>
<li>Automated validation of backup integrity</li>
</ul>
</li>
</ol>
<p>Implementation using a CronJob:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">batch/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">CronJob</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-backup</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">schedule:</span> <span class="hljs-string">"0 2 * * *"</span>  <span class="hljs-comment"># Run daily at 2 AM</span>
  <span class="hljs-attr">jobTemplate:</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">template:</span>
        <span class="hljs-attr">spec:</span>
          <span class="hljs-attr">containers:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-backup</span>
            <span class="hljs-attr">image:</span> <span class="hljs-string">postgres:13</span>
            <span class="hljs-attr">command:</span> [<span class="hljs-string">"/bin/bash"</span>, <span class="hljs-string">"-c"</span>]
            <span class="hljs-attr">args:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-string">|
              pg_dump -h postgres-service -U n8n -d n8n | gzip &gt; /backups/n8n-$(date +%Y%m%d).sql.gz
              # Upload to Azure Blob Storage (note: the stock postgres:13 image does not ship the az CLI; use an image that bundles it)
              az storage blob upload --account-name n8nbackups --container-name backups --name n8n-$(date +%Y%m%d).sql.gz --file /backups/n8n-$(date +%Y%m%d).sql.gz
</span>            <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">backup-volume</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/backups</span>
            <span class="hljs-attr">env:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">PGPASSWORD</span>
              <span class="hljs-attr">valueFrom:</span>
                <span class="hljs-attr">secretKeyRef:</span>
                  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
                  <span class="hljs-attr">key:</span> <span class="hljs-string">DB_POSTGRESDB_PASSWORD</span>
          <span class="hljs-attr">volumes:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">backup-volume</span>
            <span class="hljs-attr">emptyDir:</span> {}
          <span class="hljs-attr">restartPolicy:</span> <span class="hljs-string">OnFailure</span>
</code></pre>
<ol start="2">
<li><strong>Disaster Recovery Plan</strong>:<ul>
<li>Documented recovery procedures</li>
<li>Regular DR testing (quarterly)</li>
<li>Recovery time objective (RTO): 2 hours</li>
<li>Recovery point objective (RPO): 24 hours</li>
</ul>
</li>
</ol>
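<p>The "automated validation of backup integrity" step can start with a simple <code>gzip -t</code> pass before old backups are rotated out (a minimal local sketch; the stand-in dump and file path are illustrative):</p>
<pre><code class="lang-bash"># Create a stand-in dump, then verify the archive is readable
printf 'SELECT 1;\n' | gzip -c > /tmp/n8n-backup-test.sql.gz
if gzip -t /tmp/n8n-backup-test.sql.gz; then echo "archive OK"; fi
# -> archive OK
</code></pre>
<p>The matching restore path is roughly <code>gunzip -c backup.sql.gz</code> piped into <code>psql</code> on the postgres pod — worth rehearsing during the quarterly DR tests.</p>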
<h3 id="heading-update-strategy">Update Strategy</h3>
<p>For keeping the deployment up-to-date, we established:</p>
<ol>
<li><p><strong>n8n Version Updates</strong>:</p>
<ul>
<li>Monthly update schedule</li>
<li>Canary deployment approach (update one pod, validate, then update others)</li>
<li>Rollback procedures documented and tested</li>
</ul>
</li>
<li><p><strong>Kubernetes and Infrastructure Updates</strong>:</p>
<ul>
<li>Quarterly AKS version assessment</li>
<li>Security patches applied promptly</li>
<li>Node recycling strategy (one node at a time)</li>
</ul>
</li>
</ol>
<p>Update implementation with zero-downtime:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Update n8n with rolling deployment</span>
kubectl <span class="hljs-built_in">set</span> image deployment/n8n n8n=n8nio/n8n:new-version -n n8n

<span class="hljs-comment"># Wait for rollout to complete</span>
kubectl rollout status deployment/n8n -n n8n

<span class="hljs-comment"># If issues detected, rollback</span>
kubectl rollout undo deployment/n8n -n n8n
</code></pre>
<h3 id="heading-maintenance-powershell-script">Maintenance PowerShell Script</h3>
<p>We created a maintenance PowerShell script for routine operations:</p>
<pre><code class="lang-powershell"><span class="hljs-comment"># manage-n8n.ps1 - Common management operations</span>

<span class="hljs-keyword">param</span>(
    [<span class="hljs-type">Parameter</span>(<span class="hljs-type">Mandatory</span>=<span class="hljs-variable">$true</span>)]
    [<span class="hljs-type">ValidateSet</span>(<span class="hljs-string">"status"</span>, <span class="hljs-string">"logs"</span>, <span class="hljs-string">"restart"</span>, <span class="hljs-string">"scale"</span>, <span class="hljs-string">"backup"</span>)]
    [<span class="hljs-built_in">string</span>]<span class="hljs-variable">$Operation</span>,

    [<span class="hljs-type">Parameter</span>(<span class="hljs-type">Mandatory</span>=<span class="hljs-variable">$false</span>)]
    [<span class="hljs-built_in">string</span>]<span class="hljs-variable">$Component</span> = <span class="hljs-string">"n8n"</span>,

    [<span class="hljs-type">Parameter</span>(<span class="hljs-type">Mandatory</span>=<span class="hljs-variable">$false</span>)]
    [<span class="hljs-built_in">int</span>]<span class="hljs-variable">$Replicas</span> = <span class="hljs-number">0</span>
)

<span class="hljs-comment"># Script implementation omitted for brevity</span>
<span class="hljs-comment"># See full script in the repository</span>
</code></pre>
<p>This script simplifies common maintenance tasks and ensures consistent procedures.</p>
<h2 id="heading-performance-optimization">Performance Optimization</h2>
<h3 id="heading-resource-tuning">Resource Tuning</h3>
<p>Based on performance monitoring, we optimized resource allocations:</p>
<ol>
<li><p><strong>n8n Workers</strong>:</p>
<ul>
<li>Increased memory allocation to 1.5Gi for complex workflows</li>
<li>Fine-tuned CPU requests based on actual usage patterns</li>
<li>Adjusted HPA thresholds to scale earlier</li>
</ul>
</li>
<li><p><strong>PostgreSQL</strong>:</p>
<ul>
<li>Optimized shared_buffers and work_mem settings</li>
<li>Implemented connection pooling with PgBouncer</li>
<li>Added indexes for frequently queried fields</li>
</ul>
</li>
</ol>
<p>Implementation for PostgreSQL tuning:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-config</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">postgresql.conf:</span> <span class="hljs-string">|
    shared_buffers = 256MB
    work_mem = 16MB
    maintenance_work_mem = 64MB
    effective_cache_size = 768MB
    max_connections = 100
    # Additional optimized settings omitted for brevity</span>
</code></pre>
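<p>The "scale earlier" HPA adjustment for the workers could look like the following (illustrative values; the Deployment name <code>n8n-worker</code> is an assumption):</p>
<pre><code class="lang-yaml">apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: n8n-worker
  namespace: n8n
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: n8n-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60   # scale before saturation rather than at 80-90%
</code></pre>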
<h3 id="heading-n8n-configuration-optimization">n8n Configuration Optimization</h3>
<p>We fine-tuned n8n configuration based on production usage patterns:</p>
<ol>
<li><p><strong>Workflow Execution Settings</strong>:</p>
<ul>
<li>Adjusted <code>EXECUTIONS_PROCESS</code> for optimal resource usage</li>
<li>Configured execution timeout parameters for long-running workflows</li>
<li>Optimized retry mechanisms for external service connections</li>
</ul>
</li>
<li><p><strong>Queue Management</strong>:</p>
<ul>
<li>Implemented queue priority settings for critical workflows</li>
<li>Configured dedicated queues for different workflow types</li>
<li>Optimized job concurrency settings per worker</li>
</ul>
</li>
</ol>
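<p>In the n8n Deployment these settings are ordinary environment variables. An illustrative fragment (the values are starting points to tune, not recommendations):</p>
<pre><code class="lang-yaml">- name: EXECUTIONS_PROCESS
  value: "main"        # or "own" to run each execution in its own process
- name: EXECUTIONS_TIMEOUT
  value: "3600"        # default workflow timeout in seconds
- name: EXECUTIONS_TIMEOUT_MAX
  value: "7200"        # ceiling a workflow can raise its own timeout to
</code></pre>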
<h2 id="heading-cost-optimization">Cost Optimization</h2>
<h3 id="heading-resource-right-sizing">Resource Right-Sizing</h3>
<p>We implemented several cost optimization strategies:</p>
<ol>
<li><p><strong>Node Pools and VM Sizing</strong>:</p>
<ul>
<li>Used Azure Spot Instances for worker nodes (50-80% cost savings)</li>
<li>Implemented node auto-scaling to reduce idle capacity</li>
<li>Right-sized VM types based on actual usage patterns</li>
</ul>
</li>
<li><p><strong>Storage Optimization</strong>:</p>
<ul>
<li>Implemented log retention policies</li>
<li>Used premium storage only for performance-critical components</li>
<li>Set up automatic storage cleanup for temporary data</li>
</ul>
</li>
</ol>
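<p>Adding a Spot-backed pool for the workers is a single CLI call (a sketch; the pool name and VM size are examples, and <code>--spot-max-price -1</code> means "pay up to the current on-demand price"):</p>
<pre><code class="lang-bash">az aks nodepool add \
  --resource-group n8n-aks-rg \
  --cluster-name n8n-cluster \
  --name spotworkers \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3 \
  --node-vm-size Standard_D2s_v3
</code></pre>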
<h3 id="heading-cost-analysis">Cost Analysis</h3>
<p>We conducted a comprehensive cost analysis:</p>
<pre><code>Monthly Cost Breakdown:
- AKS Nodes (<span class="hljs-number">2</span> x D2s v3): $<span class="hljs-number">140.16</span>
- Storage (Premium SSD, <span class="hljs-number">64</span> GB): $<span class="hljs-number">10.44</span>
- Networking (Load Balancer, Outbound): $<span class="hljs-number">23.00</span>
- Monitoring: $<span class="hljs-number">7.50</span>
- Backups: $<span class="hljs-number">5.20</span>
----------------------------------
Total Estimated Monthly Cost: $<span class="hljs-number">186.30</span>
</code></pre><p>Cost optimization reduced the original estimate by approximately 30%.</p>
<h2 id="heading-operational-architecture">Operational Architecture</h2>
<p>The complete operational architecture with monitoring components can be visualized as:</p>
<pre><code class="lang-mermaid">flowchart TB
    subgraph "Azure AKS Cluster"
        subgraph "n8n Workloads"
            n8n["n8n Main"]
            workers["n8n Workers"]
            pg["PostgreSQL"]
            redis["Redis"]
        end

        subgraph "Monitoring"
            azm["Azure Monitor"]
            la["Log Analytics"]
            ai["Application Insights"]
        end

        subgraph "Operations"
            backup["Backup CronJob"]
            hpa["HPA Controller"]
        end
    end

    subgraph "Azure Services"
        storage["Azure Storage\n(Backups)"]
        alerts["Azure Alerts"]
        dashboard["Azure Dashboard"]
    end

    n8n --&gt; azm
    workers --&gt; azm
    pg --&gt; azm
    redis --&gt; azm

    azm --&gt; la
    la --&gt; ai

    backup --&gt; pg
    backup --&gt; storage
    hpa --&gt; workers

    azm --&gt; alerts
    la --&gt; dashboard

    style azm fill:#f9f,stroke:#333
    style la fill:#f9f,stroke:#333
    style ai fill:#f9f,stroke:#333
    style backup fill:#ff9,stroke:#333
    style hpa fill:#ff9,stroke:#333
</code></pre>
<h2 id="heading-health-checks-and-validation">Health Checks and Validation</h2>
<h3 id="heading-comprehensive-health-check-script">Comprehensive Health Check Script</h3>
<p>We created a comprehensive health check script to verify all components:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
<span class="hljs-comment"># health-check.sh - Verify all components of n8n deployment</span>

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Checking pod status..."</span>
kubectl get pods -n n8n

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Checking service endpoints..."</span>
kubectl get endpoints -n n8n

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Checking certificate status..."</span>
kubectl get certificate -n n8n

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Checking database connection..."</span>
kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  node -e <span class="hljs-string">"const { Pool } = require('pg'); const pool = new Pool({connectionString: process.env.DB_POSTGRESDB_URL}); pool.query('SELECT NOW()', (err, res) =&gt; { console.log(err || res.rows[0]); pool.end(); })"</span>

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Checking Redis connection..."</span>
kubectl <span class="hljs-built_in">exec</span> -it $(kubectl get pod -l app=n8n -n n8n -o jsonpath=<span class="hljs-string">'{.items[0].metadata.name}'</span>) -n n8n -- \
  node -e <span class="hljs-string">"const Redis = require('ioredis'); const redis = new Redis(process.env.QUEUE_BULL_REDIS_HOST); redis.ping().then(res =&gt; { console.log(res); redis.disconnect(); })"</span>

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Checking external access..."</span>
curl -I https://n8n.behooked.co
</code></pre>
<p>This script provides a quick way to validate all aspects of the deployment.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>With our monitoring, maintenance, and optimization strategies in place, our n8n deployment is truly production-ready. We can proactively identify issues, maintain system health, and optimize resources for both performance and cost efficiency.</p>
<p>In the next article, we'll explore comprehensive troubleshooting approaches for common issues you might encounter with your n8n deployment. [Continue to Part 7: Troubleshooting Guide]</p>
<hr />
<p>What monitoring tools have you found most effective for Kubernetes workloads? Are there specific metrics you focus on for workflow automation systems? Share your experiences in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Securing n8n on Kubernetes: Ingress, SSL/TLS, and Best Practices]]></title><description><![CDATA[This is Part 5 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Configuring External Access and SSL/TLS
With our n8n application successfully deployed, we need to...]]></description><link>https://nikhilmishra.xyz/securing-n8n-on-kubernetes-ingress-ssltls-and-best-practices</link><guid isPermaLink="true">https://nikhilmishra.xyz/securing-n8n-on-kubernetes-ingress-ssltls-and-best-practices</guid><category><![CDATA[secure workflow automatio]]></category><category><![CDATA[n8n]]></category><category><![CDATA[Security]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[SSL]]></category><category><![CDATA[SSL Configuration]]></category><category><![CDATA[WorkflowConfiguration]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 09:03:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740906120675/76f6fc2d-7801-4a4e-a5d9-5662562eaab5.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 5 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a target="_blank" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-configuring-external-access-and-ssltls">Configuring External Access and SSL/TLS</h1>
<p>With our n8n application successfully deployed, we need to make it securely accessible from the internet. This involves:</p>
<ol>
<li>Setting up an Ingress resource to route traffic to n8n</li>
<li>Implementing SSL/TLS encryption for secure communication</li>
<li>Configuring DNS for external access</li>
</ol>
<p>Let's implement these components to complete our production deployment.</p>
<h2 id="heading-implementing-cert-manager-for-ssltls">Implementing Cert-Manager for SSL/TLS</h2>
<h3 id="heading-why-ssltls-is-critical">Why SSL/TLS is Critical</h3>
<p>For a production workflow automation system, SSL/TLS encryption is essential because:</p>
<ul>
<li>It protects sensitive data transmitted between clients and n8n</li>
<li>It prevents man-in-the-middle attacks</li>
<li>It builds trust with users and external services</li>
<li>It's required for many modern browser features</li>
<li>It's a prerequisite for compliance with security standards</li>
</ul>
<h3 id="heading-installing-cert-manager">Installing Cert-Manager</h3>
<p>We used cert-manager to automate the issuance and renewal of SSL/TLS certificates from Let's Encrypt:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Add the Jetstack Helm repository</span>
helm repo add jetstack https://charts.jetstack.io
helm repo update

<span class="hljs-comment"># Install cert-manager with CRDs</span>
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --<span class="hljs-built_in">set</span> installCRDs=<span class="hljs-literal">true</span>
</code></pre>
<p>After installation, we verified cert-manager was running correctly:</p>
<pre><code class="lang-bash">kubectl get pods -n cert-manager
</code></pre>
<p>Expected output:</p>
<pre><code>NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-xxxxxxxxx-xxxxx               <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">2</span>m
cert-manager-cainjector-xxxxxxxxx-xxxxx    <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">2</span>m
cert-manager-webhook-xxxxxxxxx-xxxxx       <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">2</span>m
</code></pre><h3 id="heading-configuring-a-clusterissuer-for-lets-encrypt">Configuring a ClusterIssuer for Let's Encrypt</h3>
<p>With cert-manager installed, we created a ClusterIssuer resource to integrate with Let's Encrypt:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">cert-manager.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ClusterIssuer</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">letsencrypt-prod</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">acme:</span>
    <span class="hljs-comment"># The ACME server URL for Let's Encrypt production</span>
    <span class="hljs-attr">server:</span> <span class="hljs-string">https://acme-v02.api.letsencrypt.org/directory</span>
    <span class="hljs-comment"># Email address for Important account notifications</span>
    <span class="hljs-attr">email:</span> <span class="hljs-string">your-email@example.com</span>
    <span class="hljs-comment"># Name of a secret used to store the ACME account private key</span>
    <span class="hljs-attr">privateKeySecretRef:</span>
      <span class="hljs-attr">name:</span> <span class="hljs-string">letsencrypt-prod-account-key</span>
    <span class="hljs-comment"># Enable the HTTP-01 challenge provider</span>
    <span class="hljs-attr">solvers:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">http01:</span>
        <span class="hljs-attr">ingress:</span>
          <span class="hljs-attr">class:</span> <span class="hljs-string">nginx</span>
</code></pre>
<p>This configuration:</p>
<ul>
<li>Uses Let's Encrypt's production ACME server</li>
<li>Specifies your email for notifications about certificate expiry</li>
<li>Uses the HTTP-01 challenge method for domain validation</li>
<li>Associates with our nginx ingress controller</li>
</ul>
<p>We applied this configuration:</p>
<pre><code class="lang-bash">kubectl apply -f cluster-issuer.yaml
</code></pre>
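<p>Before requesting any certificates, it's worth confirming the ClusterIssuer registered successfully with the ACME server (the resource name matches the manifest above; exact output varies by cert-manager version):</p>
<pre><code class="lang-bash"># The READY column should show "True" once the ACME account is registered
kubectl get clusterissuer letsencrypt-prod

# For troubleshooting, inspect the status conditions in detail
kubectl describe clusterissuer letsencrypt-prod
</code></pre>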
<h2 id="heading-ingress-configuration-with-ssltls">Ingress Configuration with SSL/TLS</h2>
<h3 id="heading-setting-up-dns">Setting Up DNS</h3>
<p>Before configuring the Ingress, we created a DNS A record pointing to the external IP of our NGINX Ingress Controller:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Record Type</td><td>Name</td><td>Value</td><td>TTL</td></tr>
</thead>
<tbody>
<tr>
<td>A</td><td>n8n.behooked.co</td><td>74.179.239.172</td><td>3600</td></tr>
</tbody>
</table>
</div><blockquote>
<p>Note: Use your actual domain and the external IP from your NGINX Ingress Controller.</p>
</blockquote>
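<p>It's also worth confirming the record has propagated before applying the Ingress, since the HTTP-01 challenge will fail if Let's Encrypt cannot resolve the domain. A quick check (substitute your own domain):</p>
<pre><code class="lang-bash"># Should print the external IP of the NGINX Ingress Controller
dig +short n8n.behooked.co

# Alternative if dig is not installed
nslookup n8n.behooked.co
</code></pre>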
<h3 id="heading-creating-the-ingress-resource">Creating the Ingress Resource</h3>
<p>Now we can create an Ingress resource to route external traffic to our n8n service and configure SSL/TLS:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Ingress</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-ingress</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
  <span class="hljs-attr">annotations:</span>
    <span class="hljs-attr">kubernetes.io/ingress.class:</span> <span class="hljs-string">"nginx"</span>
    <span class="hljs-attr">cert-manager.io/cluster-issuer:</span> <span class="hljs-string">"letsencrypt-prod"</span>
    <span class="hljs-attr">nginx.ingress.kubernetes.io/ssl-redirect:</span> <span class="hljs-string">"true"</span>
    <span class="hljs-attr">nginx.ingress.kubernetes.io/proxy-body-size:</span> <span class="hljs-string">"50m"</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">tls:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">hosts:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">n8n.behooked.co</span>
    <span class="hljs-attr">secretName:</span> <span class="hljs-string">n8n-tls-secret</span>
  <span class="hljs-attr">rules:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">host:</span> <span class="hljs-string">n8n.behooked.co</span>
    <span class="hljs-attr">http:</span>
      <span class="hljs-attr">paths:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">path:</span> <span class="hljs-string">/</span>
        <span class="hljs-attr">pathType:</span> <span class="hljs-string">Prefix</span>
        <span class="hljs-attr">backend:</span>
          <span class="hljs-attr">service:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">n8n</span>
            <span class="hljs-attr">port:</span>
              <span class="hljs-attr">number:</span> <span class="hljs-number">5678</span>
</code></pre>
<p>Key aspects of this configuration:</p>
<ul>
<li>References our ClusterIssuer to automatically obtain a certificate</li>
<li>Enables SSL redirection to force HTTPS</li>
<li>Increases the allowed body size for uploading files to n8n</li>
<li>Routes all traffic for our domain to the n8n service</li>
<li>Specifies the TLS secret where the certificate will be stored</li>
</ul>
<p>We applied this configuration:</p>
<pre><code class="lang-bash">kubectl apply -f n8n-ingress.yaml
</code></pre>
<h3 id="heading-verifying-certificate-issuance">Verifying Certificate Issuance</h3>
<p>After applying the Ingress resource, cert-manager automatically requests a certificate from Let's Encrypt. We checked the status with:</p>
<pre><code class="lang-bash">kubectl get certificate -n n8n
</code></pre>
<p>Expected output when successful:</p>
<pre><code>NAME             READY   SECRET           AGE
n8n-tls-secret   True    n8n-tls-secret   <span class="hljs-number">3</span>m
</code></pre><p>If the READY status is "False", we can check for issues:</p>
<pre><code class="lang-bash">kubectl describe certificate n8n-tls-secret -n n8n
</code></pre>
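<p>cert-manager works through a chain of intermediate resources (CertificateRequest, Order, Challenge), and the failure reason usually surfaces on one of them. A typical troubleshooting sequence looks like this (output varies by cert-manager version):</p>
<pre><code class="lang-bash"># Walk down the chain until one resource shows an error
kubectl get certificaterequest -n n8n
kubectl get order -n n8n
kubectl get challenge -n n8n

# The Challenge usually carries the actionable message
# (e.g. DNS not propagated, port 80 unreachable)
kubectl describe challenge -n n8n
</code></pre>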
<h2 id="heading-external-access-flow">External Access Flow</h2>
<p>The following diagram illustrates how external requests flow through our system:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User as User
    participant DNS as DNS (n8n.behooked.co)
    participant Ingress as NGINX Ingress
    participant CM as Cert-Manager
    participant LE as Let's Encrypt
    participant n8n as n8n Service

    User-&gt;&gt;DNS: Request n8n.behooked.co
    DNS--&gt;&gt;User: Resolve to Ingress IP

    User-&gt;&gt;Ingress: HTTPS Request

    alt First Request / Certificate Issuance
        Ingress-&gt;&gt;CM: Check for certificate
        CM-&gt;&gt;LE: Request certificate
        LE-&gt;&gt;CM: Challenge for domain validation
        CM-&gt;&gt;Ingress: Configure validation endpoint
        LE-&gt;&gt;Ingress: Validate domain ownership
        LE--&gt;&gt;CM: Issue certificate
        CM--&gt;&gt;Ingress: Store certificate
    end

    Ingress-&gt;&gt;Ingress: TLS Termination
    Ingress-&gt;&gt;n8n: Forward request
    n8n--&gt;&gt;Ingress: Response
    Ingress--&gt;&gt;User: Encrypted response
</code></pre>
<p>This process provides:</p>
<ul>
<li>Automatic certificate issuance and renewal</li>
<li>End-to-end encryption for all external traffic</li>
<li>Simplified certificate management</li>
</ul>
<h2 id="heading-external-access-architecture">External Access Architecture</h2>
<p>The complete external access architecture can be visualized as:</p>
<pre><code class="lang-mermaid">flowchart LR
    users[("Internet Users")]
    domain["Domain:\nn8n.behooked.co"]

    subgraph "Azure AKS"
        ingress["NGINX Ingress\nController"]
        cert["Cert-Manager"]

        subgraph "n8n Namespace"
            n8nSvc["n8n Service"]
            n8nPod["n8n Pod"]
        end

        ingress --&gt;|"HTTPS\nPort 443"| n8nSvc
        n8nSvc --&gt; n8nPod
        cert -.-&gt;|"Manages\nCertificates"| ingress
    end

    users --&gt;|"HTTPS\nPort 443"| domain
    domain --&gt;|"A Record\n74.179.239.172"| ingress

    style ingress fill:#f96,stroke:#333
    style cert fill:#ff9,stroke:#333
    style n8nSvc fill:#9cf,stroke:#333
    style n8nPod fill:#9fc,stroke:#333
</code></pre>
<h2 id="heading-security-considerations">Security Considerations</h2>
<p>Our external access implementation includes several security enhancements:</p>
<ol>
<li><strong>Force HTTPS</strong>: All HTTP requests are automatically redirected to HTTPS</li>
<li><strong>Modern TLS</strong>: Let's Encrypt provides modern TLS certificates with strong encryption</li>
<li><strong>Automatic Renewal</strong>: Certificates are renewed automatically before they expire</li>
<li><strong>Rate Limiting</strong>: Can be configured on the Ingress to prevent abuse</li>
</ol>
<h2 id="heading-validation">Validation</h2>
<p>To validate our external access configuration, we performed several checks:</p>
<h3 id="heading-1-ingress-status">1. Ingress Status</h3>
<pre><code class="lang-bash">kubectl get ingress -n n8n
</code></pre>
<p>Expected output:</p>
<pre><code>NAME          CLASS    HOSTS              ADDRESS          PORTS     AGE
n8n-ingress   &lt;none&gt;   n8n.behooked.co    <span class="hljs-number">74.179</span><span class="hljs-number">.239</span><span class="hljs-number">.172</span>   <span class="hljs-number">80</span>, <span class="hljs-number">443</span>   <span class="hljs-number">5</span>m
</code></pre><h3 id="heading-2-certificate-status">2. Certificate Status</h3>
<pre><code class="lang-bash">kubectl get certificate -n n8n
</code></pre>
<p>Expected output:</p>
<pre><code>NAME             READY   SECRET           AGE
n8n-tls-secret   True    n8n-tls-secret   <span class="hljs-number">5</span>m
</code></pre><h3 id="heading-3-browser-access">3. Browser Access</h3>
<p>We accessed <code>https://n8n.behooked.co</code> in a browser to verify:</p>
<ul>
<li>The site loads correctly</li>
<li>The connection is secure (padlock icon)</li>
<li>The certificate is valid and issued by Let's Encrypt</li>
</ul>
<h3 id="heading-4-certificate-details">4. Certificate Details</h3>
<p>We also examined the certificate details in the browser to confirm:</p>
<ul>
<li>The correct domain name</li>
<li>Valid issue and expiry dates</li>
<li>Let's Encrypt as the Certificate Authority</li>
</ul>
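<p>The same checks can be scripted from the command line. For example, this prints the issuer and validity window of the certificate actually being served (assuming <code>openssl</code> is available locally):</p>
<pre><code class="lang-bash">echo | openssl s_client -connect n8n.behooked.co:443 -servername n8n.behooked.co 2&gt;/dev/null \
  | openssl x509 -noout -issuer -dates
</code></pre>
<p>The issuer line should reference Let's Encrypt, and <code>notAfter</code> should be roughly 90 days out.</p>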
<h2 id="heading-conclusion">Conclusion</h2>
<p>Our n8n deployment is now securely accessible from the internet with HTTPS encryption, thanks to our Ingress configuration and Let's Encrypt integration. Users can safely access the n8n interface and external systems can securely connect to webhooks.</p>
<p>In the next article, we'll implement monitoring, maintenance procedures, and optimization techniques to ensure our deployment remains healthy and efficient. [Continue to Part 6: Monitoring and Optimization]</p>
<hr />
<p>What challenges have you faced when implementing SSL/TLS for your Kubernetes applications? Have you used Let's Encrypt or other certificate providers? Share your insights in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Scaling n8n with Queue Mode on Kubernetes: Worker Deployment Guide]]></title><description><![CDATA[This is Part 4 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Implementing the Application Layer
With our data layer in place, we can now implement the applicat...]]></description><link>https://nikhilmishra.xyz/scaling-n8n-with-queue-mode-on-kubernetes-worker-deployment-guide</link><guid isPermaLink="true">https://nikhilmishra.xyz/scaling-n8n-with-queue-mode-on-kubernetes-worker-deployment-guide</guid><category><![CDATA[workers scaling]]></category><category><![CDATA[n8n]]></category><category><![CDATA[queue]]></category><category><![CDATA[distributed system]]></category><category><![CDATA[workers]]></category><category><![CDATA[AKS,Azure kubernetes services]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 08:54:48 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740905556974/6ffeefb5-3a48-4d52-a8d2-35bdc78835fb.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 4 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a target="_blank" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-implementing-the-application-layer">Implementing the Application Layer</h1>
<p>With our data layer in place, we can now implement the application layer of our n8n deployment. This consists of two main components:</p>
<ol>
<li><strong>n8n Main</strong>: The primary n8n instance that serves the UI and API</li>
<li><strong>n8n Workers</strong>: Dedicated execution nodes that process workflows</li>
</ol>
<p>Let's configure these components for a production-ready deployment.</p>
<h2 id="heading-n8n-configuration-and-environment-variables">n8n Configuration and Environment Variables</h2>
<p>Before deploying n8n, we need to understand the key configuration options available.</p>
<h3 id="heading-key-environment-variables">Key Environment Variables</h3>
<p>n8n can be configured through numerous environment variables. The most important ones for our deployment are:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Variable</td><td>Description</td><td>Value for Our Setup</td></tr>
</thead>
<tbody>
<tr>
<td><code>DB_TYPE</code></td><td>Database type</td><td><code>postgresdb</code></td></tr>
<tr>
<td><code>DB_POSTGRESDB_HOST</code></td><td>PostgreSQL hostname</td><td><code>postgres-service</code></td></tr>
<tr>
<td><code>DB_POSTGRESDB_PORT</code></td><td>PostgreSQL port</td><td><code>5432</code></td></tr>
<tr>
<td><code>DB_POSTGRESDB_DATABASE</code></td><td>Database name</td><td><code>n8n</code></td></tr>
<tr>
<td><code>DB_POSTGRESDB_USER</code></td><td>Database user</td><td><code>n8n</code></td></tr>
<tr>
<td><code>DB_POSTGRESDB_PASSWORD</code></td><td>Database password</td><td><code>[secured]</code></td></tr>
<tr>
<td><code>EXECUTIONS_MODE</code></td><td>Execution mode</td><td><code>queue</code></td></tr>
<tr>
<td><code>QUEUE_BULL_REDIS_HOST</code></td><td>Redis hostname</td><td><code>redis-service</code></td></tr>
<tr>
<td><code>QUEUE_BULL_REDIS_PORT</code></td><td>Redis port</td><td><code>6379</code></td></tr>
<tr>
<td><code>N8N_ENCRYPTION_KEY</code></td><td>Encryption key for credentials</td><td><code>[secured]</code></td></tr>
<tr>
<td><code>WEBHOOK_TUNNEL_URL</code></td><td>External webhook URL</td><td><code>https://n8n.yourdomain.com</code></td></tr>
</tbody>
</table>
</div><h3 id="heading-n8n-secrets">n8n Secrets</h3>
<p>We created Kubernetes secrets to store sensitive values:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Secret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">type:</span> <span class="hljs-string">Opaque</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">N8N_ENCRYPTION_KEY:</span> <span class="hljs-string">YVZ2UnlSeXdWN1VjWjAzcWdzQWJQUWY0U1ZCV1Y0bWg=</span>  <span class="hljs-comment"># base64 encoded random string</span>
  <span class="hljs-attr">WEBHOOK_TUNNEL_URL:</span> <span class="hljs-string">aHR0cHM6Ly9uOG4uYmVob29rZWQuY28=</span>  <span class="hljs-comment"># base64 encoded URL</span>
  <span class="hljs-attr">DB_POSTGRESDB_USER:</span> <span class="hljs-string">bjhu</span>  <span class="hljs-comment"># base64 encoded "n8n"</span>
  <span class="hljs-attr">DB_POSTGRESDB_PASSWORD:</span> <span class="hljs-string">c2VjdXJlLXBhc3N3b3JkLWhlcmU=</span>  <span class="hljs-comment"># base64 encoded password</span>
</code></pre>
<blockquote>
<p>Note: Always generate a strong random string for the encryption key, as it's used to encrypt credentials stored in the database.</p>
</blockquote>
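<p>The base64 values in the manifest above can be generated on any machine with <code>openssl</code> and <code>base64</code>. Note the use of <code>printf</code> rather than plain <code>echo</code>: a trailing newline would silently corrupt the encoded value:</p>
<pre><code class="lang-bash"># Generate a strong random encryption key (already base64, usable as-is)
openssl rand -base64 32

# Base64-encode plain values for the Secret manifest
printf '%s' 'n8n' | base64                      # bjhu
printf '%s' 'https://n8n.behooked.co' | base64
</code></pre>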
<h2 id="heading-main-n8n-deployment">Main n8n Deployment</h2>
<p>The main n8n deployment serves the web UI and API, handling user requests and enqueueing workflows for execution.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">n8n</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">n8n</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">n8n</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">n8nio/n8n:latest</span>
        <span class="hljs-attr">ports:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">5678</span>
        <span class="hljs-attr">env:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_TYPE</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"postgresdb"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_HOST</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"postgres-service"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_PORT</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"5432"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_DATABASE</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"n8n"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_USER</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">DB_POSTGRESDB_USER</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_PASSWORD</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">DB_POSTGRESDB_PASSWORD</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">N8N_ENCRYPTION_KEY</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">N8N_ENCRYPTION_KEY</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">WEBHOOK_TUNNEL_URL</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">WEBHOOK_TUNNEL_URL</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">EXECUTIONS_MODE</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"queue"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QUEUE_BULL_REDIS_HOST</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"redis-service"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QUEUE_BULL_REDIS_PORT</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"6379"</span>
        <span class="hljs-attr">resources:</span>
          <span class="hljs-attr">requests:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"300m"</span>
          <span class="hljs-attr">limits:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"1Gi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"600m"</span>
        <span class="hljs-attr">livenessProbe:</span>
          <span class="hljs-attr">httpGet:</span>
            <span class="hljs-attr">path:</span> <span class="hljs-string">/healthz</span>
            <span class="hljs-attr">port:</span> <span class="hljs-number">5678</span>
          <span class="hljs-attr">initialDelaySeconds:</span> <span class="hljs-number">30</span>
          <span class="hljs-attr">periodSeconds:</span> <span class="hljs-number">10</span>
        <span class="hljs-attr">readinessProbe:</span>
          <span class="hljs-attr">httpGet:</span>
            <span class="hljs-attr">path:</span> <span class="hljs-string">/healthz</span>
            <span class="hljs-attr">port:</span> <span class="hljs-number">5678</span>
          <span class="hljs-attr">initialDelaySeconds:</span> <span class="hljs-number">20</span>
          <span class="hljs-attr">periodSeconds:</span> <span class="hljs-number">5</span>
</code></pre>
<p>Key aspects of this configuration:</p>
<ul>
<li>Uses the official n8n Docker image</li>
<li>Configures database and Redis connection details</li>
<li>Sets the execution mode to <code>queue</code></li>
<li>Configures resource limits</li>
<li>Adds health checks for container reliability</li>
</ul>
<h3 id="heading-n8n-service">n8n Service</h3>
<p>We exposed the n8n deployment through a Kubernetes service:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">n8n</span>
  <span class="hljs-attr">ports:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">port:</span> <span class="hljs-number">5678</span>
    <span class="hljs-attr">targetPort:</span> <span class="hljs-number">5678</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
</code></pre>
<p>This service allows the Ingress controller to route traffic to n8n.</p>
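<p>Because the service is <code>ClusterIP</code>, it isn't reachable from outside the cluster, but we can smoke-test it with a port-forward before wiring up the Ingress:</p>
<pre><code class="lang-bash"># Forward local port 5678 to the n8n service
kubectl port-forward -n n8n svc/n8n 5678:5678

# In a second terminal: a 200 response means n8n is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5678/healthz
</code></pre>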
<h2 id="heading-n8n-worker-deployment">n8n Worker Deployment</h2>
<p>One of the key advantages of our architecture is the separation of the n8n UI/API from the workflow execution. This is achieved through dedicated worker nodes.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-worker</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">2</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">n8n-worker</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">n8n-worker</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-worker</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">n8nio/n8n:latest</span>
        <span class="hljs-attr">command:</span> [<span class="hljs-string">"n8n"</span>, <span class="hljs-string">"worker"</span>]
        <span class="hljs-attr">env:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_TYPE</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"postgresdb"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_HOST</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"postgres-service"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_PORT</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"5432"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_DATABASE</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"n8n"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_USER</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">DB_POSTGRESDB_USER</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">DB_POSTGRESDB_PASSWORD</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">DB_POSTGRESDB_PASSWORD</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">N8N_ENCRYPTION_KEY</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">N8N_ENCRYPTION_KEY</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">EXECUTIONS_MODE</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"queue"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QUEUE_BULL_REDIS_HOST</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"redis-service"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QUEUE_BULL_REDIS_PORT</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"6379"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QUEUE_BULL_REDIS_PREFIX</span>
          <span class="hljs-attr">value:</span> <span class="hljs-string">"bull"</span>
        <span class="hljs-attr">resources:</span>
          <span class="hljs-attr">requests:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"300m"</span>
          <span class="hljs-attr">limits:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"1Gi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"800m"</span>
</code></pre>
<p>The key differences from the main deployment are:</p>
<ul>
<li>Command set to <code>n8n worker</code> to run in worker mode</li>
<li>Multiple replicas for parallel execution</li>
<li>Slightly different resource allocation optimized for workflow execution</li>
<li>No ports exposed (workers don't need to be accessible externally)</li>
</ul>
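<p>After applying the worker manifest, the pod logs are the quickest way to confirm the workers have connected to Redis and are waiting for jobs (log wording varies between n8n versions; the filename below assumes the manifest above was saved as <code>n8n-worker.yaml</code>):</p>
<pre><code class="lang-bash">kubectl apply -f n8n-worker.yaml

# Tail the logs of all worker pods via their label
kubectl logs -n n8n -l app=n8n-worker --tail=20
</code></pre>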
<h2 id="heading-horizontal-pod-autoscaler-for-workers">Horizontal Pod Autoscaler for Workers</h2>
<p>To handle varying workflow loads efficiently, we implemented autoscaling for the worker nodes:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">autoscaling/v2</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">HorizontalPodAutoscaler</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-worker-hpa</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">scaleTargetRef:</span>
    <span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
    <span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">n8n-worker</span>
  <span class="hljs-attr">minReplicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">maxReplicas:</span> <span class="hljs-number">5</span>
  <span class="hljs-attr">metrics:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">type:</span> <span class="hljs-string">Resource</span>
    <span class="hljs-attr">resource:</span>
      <span class="hljs-attr">name:</span> <span class="hljs-string">cpu</span>
      <span class="hljs-attr">target:</span>
        <span class="hljs-attr">type:</span> <span class="hljs-string">Utilization</span>
        <span class="hljs-attr">averageUtilization:</span> <span class="hljs-number">70</span>
</code></pre>
<p>This HPA scales the worker deployment based on CPU utilization:</p>
<ul>
<li>Scales up when CPU utilization exceeds 70%</li>
<li>Minimum of 1 worker replica (to save resources during idle periods)</li>
<li>Maximum of 5 worker replicas (to handle peak loads)</li>
</ul>
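<p>Once applied, the HPA can be watched in real time. Note that CPU-based scaling requires the metrics-server, which AKS ships by default:</p>
<pre><code class="lang-bash"># Assumes the manifest above was saved as n8n-worker-hpa.yaml
kubectl apply -f n8n-worker-hpa.yaml

# TARGETS shows current vs. target CPU utilization; REPLICAS shows scaling decisions
kubectl get hpa n8n-worker-hpa -n n8n --watch

# Cross-check actual pod resource usage
kubectl top pods -n n8n
</code></pre>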
<h2 id="heading-queue-mode-architecture">Queue Mode Architecture</h2>
<p>The following diagram illustrates how the queue mode works in our architecture:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant Client
    participant n8n as n8n Main
    participant Redis
    participant Worker as n8n Worker
    participant DB as PostgreSQL

    Client-&gt;&gt;n8n: Trigger workflow
    n8n-&gt;&gt;DB: Get workflow definition
    n8n-&gt;&gt;Redis: Enqueue workflow execution
    n8n--&gt;&gt;Client: Acknowledge trigger

    loop For each worker
        Worker-&gt;&gt;Redis: Poll for new jobs
        Redis--&gt;&gt;Worker: Return job if available
        Worker-&gt;&gt;DB: Get workflow details
        Worker-&gt;&gt;Worker: Execute workflow
        Worker-&gt;&gt;DB: Store execution results
    end

    Client-&gt;&gt;n8n: Check execution status
    n8n-&gt;&gt;DB: Retrieve execution results
    n8n--&gt;&gt;Client: Return results
</code></pre>
<p>This approach provides several advantages:</p>
<ul>
<li>The main n8n instance remains responsive even during heavy workflow execution</li>
<li>Multiple workflows can execute in parallel across worker nodes</li>
<li>Workers can be scaled independently based on execution load</li>
<li>Workflow execution continues even if the main n8n UI is restarted</li>
</ul>
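<p>Queue mode itself is switched on through environment variables on both the main and worker containers. A minimal sketch (the variable names come from n8n's queue-mode configuration; <code>redis-service</code> matches the Redis service created in Part 3 of this series):</p>
<pre><code class="lang-yaml">- name: EXECUTIONS_MODE
  value: "queue"
- name: QUEUE_BULL_REDIS_HOST
  value: "redis-service"
- name: QUEUE_BULL_REDIS_PORT
  value: "6379"
</code></pre>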
<h2 id="heading-application-layer-architecture">Application Layer Architecture</h2>
<p>Our complete application layer architecture can be visualized as:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Application Layer"
        ui["n8n Main\n(UI/API)"]

        subgraph "Worker Pool"
            w1["Worker 1"]
            w2["Worker 2"]
            w3["Worker 3 (scaled)"]
            w4["Worker 4 (scaled)"]
            w5["Worker 5 (scaled)"]
        end

        ui --&gt; w1
        ui --&gt; w2
        ui --&gt; w3
        ui --&gt; w4
        ui --&gt; w5
    end

    subgraph "Data Layer"
        redis[("Redis Queue")]
        db[("PostgreSQL")]
    end

    ui --&gt; redis
    ui --&gt; db

    w1 --&gt; redis
    w1 --&gt; db
    w2 --&gt; redis
    w2 --&gt; db
    w3 --&gt; redis
    w3 --&gt; db
    w4 --&gt; redis
    w4 --&gt; db
    w5 --&gt; redis
    w5 --&gt; db

    client["External Client"] --&gt; ui

    style ui fill:#f96,stroke:#333
    style w1,w2,w3,w4,w5 fill:#69f,stroke:#333
    style redis fill:#bbf,stroke:#333
    style db fill:#6b9,stroke:#333
</code></pre>
<p>The diagram shows:</p>
<ul>
<li>Clear separation between UI and worker instances</li>
<li>Horizontal scaling capability for workers</li>
<li>Shared data infrastructure</li>
<li>Client interaction only with the main n8n instance</li>
</ul>
<h2 id="heading-validation">Validation</h2>
<p>After deploying the application layer, we verified all components were running:</p>
<pre><code class="lang-bash">kubectl get pods -n n8n
</code></pre>
<p>Expected output:</p>
<pre><code>NAME                          READY   STATUS    RESTARTS   AGE
n8n-xxxxxxxxx-xxxxx           <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">3</span>m
n8n-worker-xxxxxxxxx-xxxxx    <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">3</span>m
n8n-worker-xxxxxxxxx-xxxxx    <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">3</span>m
postgres-xxxxxxxxx-xxxxx      <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">10</span>m
redis-xxxxxxxxx-xxxxx         <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">8</span>m
</code></pre><p>We also verified the services:</p>
<pre><code class="lang-bash">kubectl get services -n n8n
</code></pre>
<p>With expected output:</p>
<pre><code>NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
n8n               ClusterIP   <span class="hljs-number">10.0</span>.xxx.xxx     &lt;none&gt;        <span class="hljs-number">5678</span>/TCP   <span class="hljs-number">3</span>m
postgres-service  ClusterIP   <span class="hljs-number">10.0</span>.xxx.xxx     &lt;none&gt;        <span class="hljs-number">5432</span>/TCP   <span class="hljs-number">10</span>m
redis-service     ClusterIP   <span class="hljs-number">10.0</span>.xxx.xxx     &lt;none&gt;        <span class="hljs-number">6379</span>/TCP   <span class="hljs-number">8</span>m
</code></pre><p>With our application layer successfully deployed, we can now move on to the external access layer with Ingress configuration and SSL/TLS setup.</p>
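<p>Beyond pod status, the main instance's health endpoint can be probed through a temporary port-forward (an optional sanity check; n8n exposes <code>/healthz</code> for this purpose):</p>
<pre><code class="lang-bash"># In one terminal: forward the n8n service to localhost
kubectl port-forward svc/n8n -n n8n 5678:5678

# In another terminal: probe the health endpoint
curl http://localhost:5678/healthz
</code></pre>
<p>A healthy instance responds with a small JSON status payload.</p>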
<h2 id="heading-conclusion">Conclusion</h2>
<p>We've successfully deployed the n8n application layer, including the main instance for the UI/API and worker nodes for distributed execution. Our configuration enables horizontal scaling to handle varying workload demands efficiently.</p>
<p>In the next article, we'll make our n8n instance securely accessible from the internet by configuring the Ingress controller and implementing SSL/TLS with Let's Encrypt. [Continue to Part 5: External Access and Security]</p>
<hr />
<p>How are you handling workflow execution in your automation systems? Have you implemented a queue-based approach like we did here? Share your experiences in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Deploying PostgreSQL and Redis for n8n on Kubernetes: Complete Guide]]></title><description><![CDATA[This is Part 3 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Implementing the Data Layer
A robust data layer is critical for any production workflow system. In...]]></description><link>https://nikhilmishra.xyz/deploying-postgresql-and-redis-for-n8n-on-kubernetes-complete-guide</link><guid isPermaLink="true">https://nikhilmishra.xyz/deploying-postgresql-and-redis-for-n8n-on-kubernetes-complete-guide</guid><category><![CDATA[PostgreSQL]]></category><category><![CDATA[Kubernetes Deployment]]></category><category><![CDATA[n8n]]></category><category><![CDATA[Redis]]></category><category><![CDATA[Workflow Automation]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 08:43:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740904868274/86b690fa-b8ab-4e31-b1a4-3bbaa6c13cc9.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 3 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a target="_blank" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-implementing-the-data-layer">Implementing the Data Layer</h1>
<p>A robust data layer is critical for any production workflow system. In our n8n deployment, the data layer consists of two primary components:</p>
<ol>
<li><strong>PostgreSQL</strong>: For persistent storage of workflows, credentials, and execution history</li>
<li><strong>Redis</strong>: For queue management and workflow distribution</li>
</ol>
<p>Let's implement each of these components with production-grade configurations.</p>
<h2 id="heading-postgresql-deployment">PostgreSQL Deployment</h2>
<h3 id="heading-why-postgresql-for-n8n">Why PostgreSQL for n8n?</h3>
<p>n8n stores various types of data that require a reliable, ACID-compliant database:</p>
<ul>
<li>Workflow definitions</li>
<li>Execution history</li>
<li>Credentials (encrypted)</li>
<li>User accounts and settings</li>
<li>Tags and other metadata</li>
</ul>
<p>PostgreSQL is an excellent choice for n8n because it offers:</p>
<ul>
<li>Strong data integrity guarantees</li>
<li>Rich feature set for complex queries</li>
<li>Excellent performance for n8n's access patterns</li>
<li>Mature ecosystem with extensive tooling</li>
<li>Open-source with enterprise reliability</li>
</ul>
<h3 id="heading-security-best-practices-for-postgresql">Security Best Practices for PostgreSQL</h3>
<p>For our production deployment, we implemented several security best practices:</p>
<ol>
<li><strong>Dedicated Non-Root User</strong>: Created a specific database user for n8n access</li>
<li><strong>Password Security</strong>: Stored database credentials in Kubernetes Secrets</li>
<li><strong>Network Isolation</strong>: Restricted access to within the Kubernetes cluster only</li>
<li><strong>Resource Limits</strong>: Set appropriate CPU and memory limits</li>
</ol>
<h3 id="heading-postgresql-kubernetes-secrets">PostgreSQL Kubernetes Secrets</h3>
<p>First, we created a Kubernetes Secret to store database credentials:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Secret</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-secret</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">type:</span> <span class="hljs-string">Opaque</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">POSTGRES_USER:</span> <span class="hljs-string">cG9zdGdyZXM=</span>  <span class="hljs-comment"># "postgres" base64 encoded</span>
  <span class="hljs-attr">POSTGRES_PASSWORD:</span> <span class="hljs-string">cFstNUpxdHM9UyVGYzMrTEY=</span>  <span class="hljs-comment"># base64 encoded password</span>
  <span class="hljs-attr">POSTGRES_DB:</span> <span class="hljs-string">bjhu</span>  <span class="hljs-comment"># "n8n" base64 encoded</span>
</code></pre>
<blockquote>
<p>Note: For security, always generate strong random passwords for production deployments.</p>
</blockquote>
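<p>The base64 values above can be generated (and verified) locally. Note the <code>-n</code> flag: without it, <code>echo</code> appends a newline that silently changes the encoded value:</p>
<pre><code class="lang-bash"># Encode values for the Secret (-n suppresses the trailing newline)
echo -n 'postgres' | base64   # cG9zdGdyZXM=
echo -n 'n8n' | base64        # bjhu

# Decode to double-check an entry
echo -n 'bjhu' | base64 -d    # n8n
</code></pre>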
<p>We applied this secret:</p>
<pre><code class="lang-bash">kubectl apply -f postgres-secret.yaml
</code></pre>
<h3 id="heading-database-initialization-configmap">Database Initialization ConfigMap</h3>
<p>To create a dedicated n8n user in PostgreSQL, we created a ConfigMap with an initialization script:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-init-script</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">init-db.sh:</span> <span class="hljs-string">|
    #!/bin/bash
    set -e
</span>
    <span class="hljs-string">psql</span> <span class="hljs-string">-v</span> <span class="hljs-string">ON_ERROR_STOP=1</span> <span class="hljs-string">--username</span> <span class="hljs-string">"$POSTGRES_USER"</span> <span class="hljs-string">--dbname</span> <span class="hljs-string">"$POSTGRES_DB"</span> <span class="hljs-string">&lt;&lt;-EOSQL</span>
      <span class="hljs-string">CREATE</span> <span class="hljs-string">USER</span> <span class="hljs-string">n8n</span> <span class="hljs-string">WITH</span> <span class="hljs-string">PASSWORD</span> <span class="hljs-string">'secure-password-here'</span><span class="hljs-string">;</span>
      <span class="hljs-string">GRANT</span> <span class="hljs-string">ALL</span> <span class="hljs-string">PRIVILEGES</span> <span class="hljs-string">ON</span> <span class="hljs-string">DATABASE</span> <span class="hljs-string">n8n</span> <span class="hljs-string">TO</span> <span class="hljs-string">n8n;</span>
      <span class="hljs-string">ALTER</span> <span class="hljs-string">USER</span> <span class="hljs-string">n8n</span> <span class="hljs-string">WITH</span> <span class="hljs-string">SUPERUSER;</span>
    <span class="hljs-string">EOSQL</span>
</code></pre>
<p>This script is executed when PostgreSQL starts, creating an n8n user with appropriate privileges.</p>
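<p>To confirm the script actually ran, the database roles can be listed from inside the running pod (an optional spot-check, assuming the deployment is named <code>postgres</code> as below):</p>
<pre><code class="lang-bash">kubectl exec -it deploy/postgres -n n8n -- psql -U postgres -d n8n -c '\du'
</code></pre>
<p>The output should include the <code>n8n</code> role. Keep in mind the official PostgreSQL image only runs init scripts when the data directory is empty, so the script will not re-run on restarts against existing data.</p>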
<h3 id="heading-postgresql-deployment-configuration">PostgreSQL Deployment Configuration</h3>
<p>Now we can create the PostgreSQL deployment:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">postgres:13</span>
        <span class="hljs-attr">ports:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">5432</span>
        <span class="hljs-attr">env:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">POSTGRES_USER</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">POSTGRES_USER</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">POSTGRES_PASSWORD</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">POSTGRES_PASSWORD</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">POSTGRES_DB</span>
          <span class="hljs-attr">valueFrom:</span>
            <span class="hljs-attr">secretKeyRef:</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-secret</span>
              <span class="hljs-attr">key:</span> <span class="hljs-string">POSTGRES_DB</span>
        <span class="hljs-attr">volumeMounts:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-data</span>
          <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/var/lib/postgresql/data</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">init-script</span>
          <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/docker-entrypoint-initdb.d</span>
        <span class="hljs-attr">resources:</span>
          <span class="hljs-attr">requests:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"500m"</span>
          <span class="hljs-attr">limits:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"1Gi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"1000m"</span>
      <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-data</span>
        <span class="hljs-attr">persistentVolumeClaim:</span>
          <span class="hljs-attr">claimName:</span> <span class="hljs-string">postgres-data-claim</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">init-script</span>
        <span class="hljs-attr">configMap:</span>
          <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-init-script</span>
</code></pre>
<p>Key aspects of this configuration:</p>
<ul>
<li>Uses the PostgreSQL 13 image</li>
<li>Mounts the persistent volume claim for data storage</li>
<li>Mounts the initialization script ConfigMap</li>
<li>Sets resource limits to ensure stability</li>
<li>Uses environment variables from Kubernetes Secrets</li>
</ul>
<h3 id="heading-postgresql-service">PostgreSQL Service</h3>
<p>To make PostgreSQL accessible to other pods in the cluster, we created a Kubernetes Service:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-service</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">ports:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">port:</span> <span class="hljs-number">5432</span>
    <span class="hljs-attr">targetPort:</span> <span class="hljs-number">5432</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
</code></pre>
<p>This service provides a stable endpoint (<code>postgres-service</code>) for other components to connect to PostgreSQL.</p>
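<p>On the n8n side, this endpoint becomes the database host. A sketch of the corresponding environment variables (names follow n8n's PostgreSQL configuration; the password should be injected from a Secret rather than written inline):</p>
<pre><code class="lang-yaml">- name: DB_TYPE
  value: "postgresdb"
- name: DB_POSTGRESDB_HOST
  value: "postgres-service"
- name: DB_POSTGRESDB_PORT
  value: "5432"
- name: DB_POSTGRESDB_DATABASE
  value: "n8n"
- name: DB_POSTGRESDB_USER
  value: "n8n"
</code></pre>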
<h2 id="heading-redis-deployment">Redis Deployment</h2>
<h3 id="heading-redis-for-queue-management">Redis for Queue Management</h3>
<p>Redis serves as the queue manager for n8n's distributed workflow execution. It:</p>
<ul>
<li>Maintains lists of pending workflows</li>
<li>Tracks workflow execution state</li>
<li>Enables worker coordination</li>
<li>Provides fast in-memory operations</li>
</ul>
<h3 id="heading-redis-deployment-configuration">Redis Deployment Configuration</h3>
<p>We deployed Redis with the following configuration:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">redis</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">redis</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">redis</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">redis</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">redis:6-alpine</span>
        <span class="hljs-attr">ports:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">6379</span>
        <span class="hljs-attr">resources:</span>
          <span class="hljs-attr">requests:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"256Mi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"200m"</span>
          <span class="hljs-attr">limits:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
            <span class="hljs-attr">cpu:</span> <span class="hljs-string">"500m"</span>
        <span class="hljs-attr">volumeMounts:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">redis-data</span>
          <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/data</span>
      <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">redis-data</span>
        <span class="hljs-attr">emptyDir:</span> {}
</code></pre>
<p>For our use case, we chose a simple Redis deployment without persistence, as the queue data can be regenerated if lost. For more critical deployments, you could add a PersistentVolumeClaim similar to PostgreSQL.</p>
<h3 id="heading-redis-service">Redis Service</h3>
<p>We created a Redis service to make it accessible to n8n components:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">redis-service</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">redis</span>
  <span class="hljs-attr">ports:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">port:</span> <span class="hljs-number">6379</span>
    <span class="hljs-attr">targetPort:</span> <span class="hljs-number">6379</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
</code></pre>
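<p>Once both manifests are applied, connectivity can be sanity-checked from inside the Redis pod (optional, assuming the deployment is named <code>redis</code>):</p>
<pre><code class="lang-bash">kubectl exec deploy/redis -n n8n -- redis-cli ping
</code></pre>
<p>A healthy instance answers <code>PONG</code>.</p>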
<h2 id="heading-data-layer-architecture-diagram">Data Layer Architecture Diagram</h2>
<p>Our complete data layer architecture can be visualized as:</p>
<pre><code class="lang-mermaid">flowchart LR
    subgraph "Data Layer"
        subgraph "PostgreSQL"
            pg[("PostgreSQL Pod")]
            pv[("Persistent Volume\n64Gi")]
            pg --&gt; pv
        end

        subgraph "Redis"
            redis[("Redis Pod")]
            mem[("In-Memory Storage")]
            redis --&gt; mem
        end
    end

    subgraph "Consumers"
        n8n["n8n Main"]
        workers["n8n Workers"]
    end

    n8n -.-&gt; pg
    n8n -.-&gt; redis
    workers -.-&gt; pg
    workers -.-&gt; redis

    style pv fill:#f9f,stroke:#333
    style mem fill:#bbf,stroke:#333
    style pg fill:#bfb,stroke:#333
    style redis fill:#bfb,stroke:#333
</code></pre>
<p>This architecture provides:</p>
<ul>
<li>Clear separation between stateful services</li>
<li>Persistent storage for critical data</li>
<li>In-memory performance for queue operations</li>
<li>Accessible services for n8n components</li>
</ul>
<h2 id="heading-validation">Validation</h2>
<p>After deploying PostgreSQL and Redis, we verified their status:</p>
<pre><code class="lang-bash">kubectl get pods -n n8n
</code></pre>
<p>Successful output looks like:</p>
<pre><code>NAME                       READY   STATUS    RESTARTS   AGE
postgres-xxxxxxxxx-xxxxx   <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">2</span>m
redis-xxxxxxxxx-xxxxx      <span class="hljs-number">1</span>/<span class="hljs-number">1</span>     Running   <span class="hljs-number">0</span>          <span class="hljs-number">1</span>m
</code></pre><p>We also verified the services:</p>
<pre><code class="lang-bash">kubectl get services -n n8n
</code></pre>
<p>With output:</p>
<pre><code>NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
postgres-service  ClusterIP   <span class="hljs-number">10.0</span>.xxx.xxx     &lt;none&gt;        <span class="hljs-number">5432</span>/TCP   <span class="hljs-number">2</span>m
redis-service     ClusterIP   <span class="hljs-number">10.0</span>.xxx.xxx     &lt;none&gt;        <span class="hljs-number">6379</span>/TCP   <span class="hljs-number">1</span>m
</code></pre><p>Now that our data layer is properly set up, we can proceed to deploying the n8n application layer in the next section.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>With our data layer successfully deployed, we have reliable PostgreSQL storage for our workflow definitions and execution history, plus Redis for efficient queue management. These components form the persistence backbone of our n8n deployment.</p>
<p>In the next article, we'll deploy the n8n application itself, including the main service and worker nodes for distributed processing. [Continue to Part 4: Application Layer]</p>
<hr />
<p>Have you implemented PostgreSQL or Redis in Kubernetes before? What database optimization techniques have worked best for your workflow automation systems? Share your insights in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[How to Set Up Azure Kubernetes Service for n8n Workflow Automation]]></title><description><![CDATA[This is Part 2 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Setting Up the Foundation
Creating the AKS Cluster
The first step in our implementation is setting...]]></description><link>https://nikhilmishra.xyz/how-to-set-up-azure-kubernetes-service-for-n8n-workflow-automation</link><guid isPermaLink="true">https://nikhilmishra.xyz/how-to-set-up-azure-kubernetes-service-for-n8n-workflow-automation</guid><category><![CDATA[AKS cluster configuration]]></category><category><![CDATA[n8n]]></category><category><![CDATA[AKS,Azure kubernetes services]]></category><category><![CDATA[Workflow Automation]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 08:33:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740904311100/0d8173c0-787d-4a4b-bca9-4174a595dd24.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 2 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a class="post-section-overview" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h1 id="heading-setting-up-the-foundation">Setting Up the Foundation</h1>
<h2 id="heading-creating-the-aks-cluster">Creating the AKS Cluster</h2>
<p>The first step in our implementation is setting up the Azure Kubernetes Service (AKS) cluster. This forms the foundation of our entire deployment.</p>
<h3 id="heading-resource-planning">Resource Planning</h3>
<p>Before creating the cluster, we determined our resource requirements:</p>
<ul>
<li><strong>Node Count</strong>: 2 nodes for basic high availability</li>
<li><strong>VM Size</strong>: D2s v3 (2 vCPUs, 8GB RAM) for good performance</li>
<li><strong>Region</strong>: East US (chosen for proximity to our users)</li>
<li><strong>Kubernetes Version</strong>: 1.25.5 (stable version with good feature support)</li>
</ul>
<h3 id="heading-creating-the-cluster-with-azure-cli">Creating the Cluster with Azure CLI</h3>
<p>We used Azure CLI for cluster creation to make the process reproducible:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Create resource group</span>
az group create --name n8n-aks-rg --location eastus

<span class="hljs-comment"># Create AKS cluster</span>
az aks create \
  --resource-group n8n-aks-rg \
  --name n8n-cluster \
  --node-count 2 \
  --node-vm-size Standard_D2s_v3 \
  --kubernetes-version 1.25.5 \
  --enable-managed-identity \
  --generate-ssh-keys
</code></pre>
<p>This command creates a basic AKS cluster with managed identity for simplified authentication. The SSH keys are automatically generated for node access if needed.</p>
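<p>Provisioning typically takes several minutes. Its progress can be polled with the following command (a convenience check, not required for the deployment):</p>
<pre><code class="lang-bash">az aks show \
  --resource-group n8n-aks-rg \
  --name n8n-cluster \
  --query provisioningState -o tsv
</code></pre>
<p>It prints <code>Succeeded</code> once the cluster is ready.</p>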
<h3 id="heading-connecting-to-the-cluster">Connecting to the Cluster</h3>
<p>After cluster creation, we configured <code>kubectl</code> to connect to our new cluster:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Get credentials</span>
az aks get-credentials --resource-group n8n-aks-rg --name n8n-cluster

<span class="hljs-comment"># Verify connection</span>
kubectl get nodes
</code></pre>
<p>The output confirmed our two nodes were running:</p>
<pre><code>NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1<span class="hljs-number">-12345678</span>-vmss000000   Ready    agent   <span class="hljs-number">3</span>m    v1<span class="hljs-number">.25</span><span class="hljs-number">.5</span>
aks-nodepool1<span class="hljs-number">-12345678</span>-vmss000001   Ready    agent   <span class="hljs-number">3</span>m    v1<span class="hljs-number">.25</span><span class="hljs-number">.5</span>
</code></pre><h2 id="heading-namespace-organization">Namespace Organization</h2>
<p>We created a dedicated namespace for our n8n deployment to isolate it from other applications that might run on the same cluster:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Create namespace</span>
kubectl create namespace n8n

<span class="hljs-comment"># Set as default namespace for this context</span>
kubectl config set-context --current --namespace=n8n
</code></pre>
<p>Using a dedicated namespace provides several benefits:</p>
<ul>
<li>Resource isolation</li>
<li>Simplified RBAC (Role-Based Access Control)</li>
<li>Clear resource organization</li>
<li>Ability to set resource quotas per namespace</li>
</ul>
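<p>As an example of the last point, a <code>ResourceQuota</code> can cap what the namespace is allowed to consume (illustrative numbers only, sized for the two-node cluster above):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: ResourceQuota
metadata:
  name: n8n-quota
  namespace: n8n
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
</code></pre>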
<h2 id="heading-network-architecture">Network Architecture</h2>
<h3 id="heading-network-considerations">Network Considerations</h3>
<p>In our design, we addressed several network requirements:</p>
<ol>
<li><strong>External Access</strong>: The n8n UI must be accessible externally via HTTPS</li>
<li><strong>Inter-Service Communication</strong>: Components need to communicate within the cluster</li>
<li><strong>Security</strong>: Network policies to restrict unnecessary communication</li>
</ol>
<h3 id="heading-implementing-the-ingress-controller">Implementing the Ingress Controller</h3>
<p>For external access, we installed the NGINX Ingress Controller:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Add the Helm repository</span>
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

<span class="hljs-comment"># Install NGINX Ingress Controller</span>
helm install nginx-ingress ingress-nginx/ingress-nginx \
  --namespace default \
  --<span class="hljs-built_in">set</span> controller.replicaCount=2 \
  --<span class="hljs-built_in">set</span> controller.nodeSelector.<span class="hljs-string">"kubernetes\.io/os"</span>=linux \
  --<span class="hljs-built_in">set</span> defaultBackend.nodeSelector.<span class="hljs-string">"kubernetes\.io/os"</span>=linux
</code></pre>
<p>After installation, we verified the ingress controller's external IP:</p>
<pre><code class="lang-bash">kubectl get service nginx-ingress-ingress-nginx-controller
</code></pre>
<p>This returned our external IP (74.179.239.172) which we later used to configure DNS.</p>
<h2 id="heading-storage-classes-and-persistent-volumes">Storage Classes and Persistent Volumes</h2>
<h3 id="heading-storage-architecture">Storage Architecture</h3>
<p>For our n8n deployment, we needed persistent storage for:</p>
<ul>
<li>PostgreSQL database</li>
<li>Redis data (if needed for persistence)</li>
</ul>
<p>AKS provides default storage classes that use Azure Disk or Azure File. We used the default storage class (<code>managed-premium</code>) which creates Azure Premium Managed Disks.</p>
<h3 id="heading-verifying-storage-classes">Verifying Storage Classes</h3>
<p>We checked available storage classes with:</p>
<pre><code class="lang-bash">kubectl get storageclass
</code></pre>
<p>The output confirmed the default storage class:</p>
<pre><code>NAME                   PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
managed-premium (<span class="hljs-keyword">default</span>)   disk.csi.azure.com   Delete          Immediate           <span class="hljs-literal">true</span>                   <span class="hljs-number">10</span>m
managed-csi               disk.csi.azure.com   Delete          Immediate           <span class="hljs-literal">true</span>                   <span class="hljs-number">10</span>m
</code></pre><h3 id="heading-creating-persistent-volume-claims">Creating Persistent Volume Claims</h3>
<p>For PostgreSQL, we created a Persistent Volume Claim (PVC) to ensure data persistence:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolumeClaim</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-data-claim</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">n8n</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteOnce</span>
  <span class="hljs-attr">resources:</span>
    <span class="hljs-attr">requests:</span>
      <span class="hljs-attr">storage:</span> <span class="hljs-string">64Gi</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">managed-premium</span>
</code></pre>
<p>We saved this as <code>postgres-pvc.yaml</code> and applied it:</p>
<pre><code class="lang-bash">kubectl apply -f postgres-pvc.yaml
</code></pre>
<p>This PVC would be used by the PostgreSQL StatefulSet to store database files.</p>
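<p>Before deploying the StatefulSet, it is worth confirming the claim actually bound (with <code>Immediate</code> binding mode, provisioning starts right away). A quick check, sketched below — the <code>pvc_bound</code> helper name is ours; the claim name and namespace match the manifest above:</p>

```shell
#!/usr/bin/env bash
# Confirm the PVC reached the Bound phase before deploying PostgreSQL.
pvc_bound() {
  # Succeeds only when the reported phase is exactly "Bound".
  [ "$1" = "Bound" ]
}

if command -v kubectl >/dev/null 2>&1; then
  phase=$(kubectl get pvc postgres-data-claim -n n8n \
    -o jsonpath='{.status.phase}')
  if pvc_bound "$phase"; then
    echo "PVC is Bound - safe to deploy PostgreSQL"
  else
    echo "PVC not ready (phase: ${phase:-unknown})" >&2
  fi
fi
```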
<h2 id="heading-deployment-process">Deployment Process</h2>
<p>The overall deployment process follows this sequence:</p>
<pre><code class="lang-mermaid">flowchart TB
    start([Start Deployment]) --&gt; prereq[Prerequisites Check]
    prereq --&gt; aks[Create AKS Cluster]
    aks --&gt; namespace[Create n8n Namespace]

    namespace --&gt; storage[Configure Storage]
    storage --&gt; secrets[Apply Secrets]

    secrets --&gt; dbPath[Database Path]
    secrets --&gt; redisPath[Redis Path]
    secrets --&gt; n8nPath[n8n Path]

    dbPath --&gt; postgres[Deploy PostgreSQL]
    redisPath --&gt; redis[Deploy Redis]

    postgres --&gt; dbInit[Initialize Database]
    redis --&gt; queueInit[Initialize Queue]

    dbInit --&gt; n8n[Deploy n8n Main]
    queueInit --&gt; n8n

    n8n --&gt; workers[Deploy n8n Workers]
    n8n --&gt; ingress[Configure Ingress] 

    ingress --&gt; certmgr[Install Cert-Manager]
    certmgr --&gt; issuer[Configure ClusterIssuer]
    issuer --&gt; cert[Obtain SSL Certificate]

    cert --&gt; validation[Validation Tests]
    workers --&gt; validation

    validation --&gt; complete([Deployment Complete])

    classDef setup fill:#f9f,stroke:#333,stroke-width:1px
    classDef deployment fill:#bbf,stroke:#333,stroke-width:1px
    classDef config fill:#bfb,stroke:#333,stroke-width:1px
    classDef validation fill:#fbf,stroke:#333,stroke-width:1px

    class start,prereq,aks,namespace setup
    class postgres,redis,n8n,workers deployment
    class storage,secrets,ingress,certmgr,issuer,cert config
    class dbInit,queueInit,validation validation
</code></pre>
<p>This workflow ensures that dependencies are deployed in the correct order, with each component building upon the previous ones. In the next section, we'll set up the foundational data layer with PostgreSQL and Redis.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>With our AKS cluster provisioned and configured, we now have a solid foundation for our n8n deployment. We've set up proper namespaces, configured networking, and prepared our persistent storage requirements.</p>
<p>In the next article, we'll implement the data layer by deploying PostgreSQL and Redis with proper security configurations and persistence. [Continue to Part 3: Data Layer Implementation]</p>
<hr />
<p>What challenges have you faced when setting up Kubernetes clusters for stateful applications? Share your experiences in the comments!</p>
]]></content:encoded></item><item><title><![CDATA[Building a Production-Ready n8n Workflow Automation Platform on AKS: Introduction & Architecture]]></title><description><![CDATA[This is Part 1 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. View the complete series here.
Introduction
In today's fast-paced digital landscape, workflow automation has become essential for...]]></description><link>https://nikhilmishra.xyz/building-a-production-ready-n8n-workflow-automation-platform-on-aks-introduction-and-architecture</link><guid isPermaLink="true">https://nikhilmishra.xyz/building-a-production-ready-n8n-workflow-automation-platform-on-aks-introduction-and-architecture</guid><category><![CDATA[n8n]]></category><category><![CDATA[Kubernetes deployments]]></category><category><![CDATA[Workflow Automation]]></category><category><![CDATA[AKS,Azure kubernetes services]]></category><category><![CDATA[workflow automation software]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 08:26:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740903845702/18087b4d-75c6-466d-882e-f006d36926fd.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is Part 1 of the "Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service" series. <a class="post-section-overview" href="https://nikhilmishra.live/series/n8n-azure-k8s">View the complete series here</a>.</em></p>
<h2 id="heading-introduction">Introduction</h2>
<p>In today's fast-paced digital landscape, workflow automation has become essential for businesses to operate efficiently. As organizations grow, the need for robust, scalable automation solutions becomes increasingly crucial. This blog post details our journey of implementing a production-grade <a target="_blank" href="https://n8n.io/">n8n</a> workflow automation platform on Azure Kubernetes Service (AKS).</p>
<h3 id="heading-what-is-n8n">What is n8n?</h3>
<p>n8n (pronounced "n-eight-n") is an open-source workflow automation tool that allows you to connect different services and build automated workflows without writing code. It's like Zapier or Integromat, but with the advantage of being self-hosted, giving you complete control over your data and workflows.</p>
<p>Some key features of n8n include:</p>
<ul>
<li>Visual workflow editor</li>
<li>200+ built-in integrations</li>
<li>Webhooks for real-time triggers</li>
<li>Custom JavaScript functions</li>
<li>Self-hosting capability</li>
<li>API access</li>
</ul>
<h3 id="heading-why-kubernetes-for-n8n">Why Kubernetes for n8n?</h3>
<p>While n8n can run on a simple VM or even locally, a production deployment demands more:</p>
<ul>
<li><strong>High Availability</strong>: Ensure your automation workflows run 24/7</li>
<li><strong>Scalability</strong>: Handle growing workflow volume as your needs increase</li>
<li><strong>Resource Efficiency</strong>: Optimize resource allocation based on actual demand</li>
<li><strong>Disaster Recovery</strong>: Protect against data loss and service interruptions</li>
<li><strong>Simplified Management</strong>: Standardize deployment, updates, and monitoring</li>
</ul>
<p>Kubernetes addresses these requirements by providing a container orchestration platform that can manage complex applications with multiple components.</p>
<h3 id="heading-why-azure-kubernetes-service">Why Azure Kubernetes Service?</h3>
<p>Azure Kubernetes Service (AKS) offers several advantages for hosting n8n:</p>
<ul>
<li><strong>Managed Control Plane</strong>: Focus on your application rather than managing Kubernetes infrastructure</li>
<li><strong>Integrated Security</strong>: Azure Active Directory integration, network security policies, and RBAC</li>
<li><strong>Simple Scaling</strong>: Easy horizontal and vertical scaling of nodes</li>
<li><strong>Azure Integration</strong>: Seamless integration with other Azure services like Storage, Monitoring, and Networking</li>
<li><strong>Cost Optimization</strong>: Pay only for the worker nodes, as the control plane is free</li>
</ul>
<h3 id="heading-first-principles-approach">First Principles Approach</h3>
<p>In this article, we'll take a first principles approach, understanding the fundamental requirements of a production workflow system before implementing our solution. This means:</p>
<ol>
<li>Starting with the core needs (data persistence, execution reliability, security)</li>
<li>Breaking down the system into logical components</li>
<li>Understanding the interactions between components</li>
<li>Building up a robust architecture that meets all requirements</li>
<li>Implementing with best practices for production environments</li>
</ol>
<p>Let's dive into the architecture from first principles.</p>
<h2 id="heading-understanding-the-architecture-first-principles">Understanding the Architecture (First Principles)</h2>
<p>When designing a production workflow system, we need to consider several fundamental requirements:</p>
<h3 id="heading-1-data-persistence">1. Data Persistence</h3>
<p>Workflows and their execution data must be stored reliably. This requires:</p>
<ul>
<li>A database for storing workflow definitions and execution records</li>
<li>Persistent storage that survives container restarts</li>
<li>Backup capabilities for disaster recovery</li>
</ul>
<h3 id="heading-2-execution-reliability">2. Execution Reliability</h3>
<p>Workflow executions must be reliable, even under high load:</p>
<ul>
<li>Queue-based processing to handle spikes in workflow triggers</li>
<li>Worker redundancy to prevent single points of failure</li>
<li>Graceful handling of errors and retries</li>
</ul>
<h3 id="heading-3-security">3. Security</h3>
<p>Sensitive data and external access must be secured:</p>
<ul>
<li>Encryption for data at rest and in transit</li>
<li>Secure storage of credentials and API keys</li>
<li>Authentication and authorization controls</li>
<li>Network security for external access</li>
</ul>
<h3 id="heading-4-scalability">4. Scalability</h3>
<p>The system must scale as workflow needs grow:</p>
<ul>
<li>Horizontal scaling for workers under load</li>
<li>Database performance optimization</li>
<li>Resource allocation efficiency</li>
</ul>
<h3 id="heading-5-maintainability">5. Maintainability</h3>
<p>The deployment must be easy to maintain over time:</p>
<ul>
<li>Monitoring and logging</li>
<li>Simple update processes</li>
<li>Documentation for operational procedures</li>
</ul>
<p>With these first principles in mind, we can design our n8n deployment architecture.</p>
<h2 id="heading-architecture-overview">Architecture Overview</h2>
<p>Our n8n on AKS architecture consists of the following components:</p>
<pre><code class="lang-mermaid">flowchart TD
    subgraph "Azure AKS Cluster"
        subgraph "External Access Layer"
            ingress["NGINX Ingress Controller"]
            cert["Cert-Manager"]
        end

        subgraph "Application Layer"
            n8n["n8n Main\n(UI/API)"]
            worker1["n8n Worker 1"]
            worker2["n8n Worker 2"]
            workern["n8n Worker n\n(Auto-scaled)"]
        end

        subgraph "Data Layer"
            postgres[("PostgreSQL\nDatabase")]
            redis[("Redis\nQueue")]
        end

        ingress &lt;--&gt; cert
        ingress --&gt; n8n
        n8n &lt;--&gt; postgres
        n8n &lt;--&gt; redis
        worker1 &lt;--&gt; postgres
        worker1 &lt;--&gt; redis
        worker2 &lt;--&gt; postgres
        worker2 &lt;--&gt; redis
        workern &lt;--&gt; postgres
        workern &lt;--&gt; redis
    end

    client[("External\nClient")] &lt;--&gt; ingress
</code></pre>
<h3 id="heading-layer-breakdown">Layer Breakdown</h3>
<h4 id="heading-1-data-layer">1. Data Layer</h4>
<ul>
<li><p><strong>PostgreSQL</strong>: Stores workflow definitions, credentials, and execution records</p>
<ul>
<li>Uses persistent volume for data durability</li>
<li>Configured with appropriate resources for performance</li>
<li>Non-root user for n8n database access</li>
</ul>
</li>
<li><p><strong>Redis</strong>: Manages workflow execution queue</p>
<ul>
<li>Enables distributed execution across workers</li>
<li>Tracks execution state and enables retries</li>
<li>Provides inter-process communication</li>
</ul>
</li>
</ul>
<h4 id="heading-2-application-layer">2. Application Layer</h4>
<ul>
<li><p><strong>n8n Main</strong>: Serves the web UI and API</p>
<ul>
<li>Handles workflow editing and management</li>
<li>Processes webhook triggers</li>
<li>Enqueues workflows for execution</li>
</ul>
</li>
<li><p><strong>n8n Workers</strong>: Execute workflow tasks</p>
<ul>
<li>Horizontally scalable based on load</li>
<li>Pull jobs from Redis queue</li>
<li>Report execution results back to the database</li>
</ul>
</li>
</ul>
<h4 id="heading-3-external-access-layer">3. External Access Layer</h4>
<ul>
<li><p><strong>NGINX Ingress Controller</strong>: Routes external traffic</p>
<ul>
<li>Terminates SSL/TLS</li>
<li>Handles HTTP routing rules</li>
<li>Load balances incoming requests</li>
</ul>
</li>
<li><p><strong>Cert-Manager</strong>: Manages SSL/TLS certificates</p>
<ul>
<li>Automates certificate issuance from Let's Encrypt</li>
<li>Handles certificate renewal</li>
<li>Configures HTTPS security</li>
</ul>
</li>
</ul>
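<p>To make the cert-manager role concrete: certificate issuance is typically driven by a <code>ClusterIssuer</code> resource. A minimal sketch is shown below — the issuer name and contact email are placeholders, not values from this series:</p>

```yaml
# Minimal Let's Encrypt issuer for cert-manager (placeholder name/email).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com        # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod        # secret storing the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx            # challenges solved via the NGINX ingress
```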
<h3 id="heading-data-flow">Data Flow</h3>
<ol>
<li>External clients connect to the n8n UI/API via the ingress controller</li>
<li>The n8n main service handles UI interactions and API requests</li>
<li>When a workflow is triggered, it's added to the Redis queue</li>
<li>Worker nodes pick up workflow executions from the queue</li>
<li>Workers execute the workflow and store results in PostgreSQL</li>
<li>The n8n main service displays execution results to users</li>
</ol>
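<p>This flow is wired together by n8n's queue-mode settings. A minimal sketch of the environment variables involved is below — the <code>postgres</code> and <code>redis</code> hostnames are assumed in-cluster service names, not values taken from this article:</p>

```yaml
# Env entries shared by the n8n main pod and the workers (queue mode).
# Hostnames below are assumed Kubernetes service names.
- name: EXECUTIONS_MODE
  value: "queue"              # hand executions to the Redis-backed queue
- name: QUEUE_BULL_REDIS_HOST
  value: "redis"
- name: DB_TYPE
  value: "postgresdb"
- name: DB_POSTGRESDB_HOST
  value: "postgres"
# Worker pods run the same image with the `n8n worker` command.
```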
<p>This architecture provides:</p>
<ul>
<li>Clear separation of concerns</li>
<li>Scalability at each layer</li>
<li>High availability through redundancy</li>
<li>Security through proper isolation</li>
</ul>
<p>In the next sections, we'll dive into the implementation details of each component, starting with the AKS cluster setup.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this architecture overview, we've laid out the foundation for a robust n8n deployment on Azure Kubernetes Service. This architecture addresses the core requirements for a production deployment: high availability, scalability, security, and maintainability.</p>
<p>By separating our architecture into distinct layers—data, application, and external access—we've created a modular design that's easier to maintain and troubleshoot. Each component has a specific responsibility, with clear interfaces between them.</p>
<p>Along the way, we've seen why n8n is a powerful choice for workflow automation and how Kubernetes provides the ideal platform for a scalable, reliable implementation.</p>
<p>In the next article, we'll turn this architecture into reality by setting up our Azure Kubernetes Service cluster, configuring networking, and preparing the persistent storage foundation. [Continue to Part 2: Setting Up the Foundation]</p>
<hr />
<p>Have you deployed n8n or similar workflow tools in Kubernetes? Share your experience in the comments!</p>
<p><em>This is the end of Part 1. Continue to [Part 2: Setting Up the Foundation] to learn how to set up your Azure Kubernetes Service cluster and prepare the foundation for your n8n deployment.</em></p>
]]></content:encoded></item><item><title><![CDATA[Series Introduction: Building a Production-Ready n8n Workflow Automation Platform on Azure Kubernetes Service]]></title><description><![CDATA[Welcome to This Series!
In this comprehensive 8-part series, I'll take you through the complete journey of deploying n8n workflow automation platform on Azure Kubernetes Service (AKS) using a first principles approach. Rather than just providing conf...]]></description><link>https://nikhilmishra.xyz/series-introduction-building-a-production-ready-n8n-workflow-automation-platform-on-azure-kubernetes-service</link><guid isPermaLink="true">https://nikhilmishra.xyz/series-introduction-building-a-production-ready-n8n-workflow-automation-platform-on-azure-kubernetes-service</guid><category><![CDATA[n8n]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[k8s]]></category><category><![CDATA[Azure]]></category><category><![CDATA[Devops]]></category><category><![CDATA[automation]]></category><category><![CDATA[self-hosted]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Sun, 02 Mar 2025 08:04:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740902312840/94c2ed6c-7b5d-46bb-a3b3-ccefd9f998f3.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-welcome-to-this-series">Welcome to This Series!</h2>
<p>In this comprehensive 8-part series, I'll take you through the complete journey of deploying <a target="_blank" href="https://n8n.io/">n8n</a> workflow automation platform on Azure Kubernetes Service (AKS) using a first principles approach. Rather than just providing configuration files, we'll explore the reasoning behind each design decision and build up a production-grade system step by step.</p>
<h2 id="heading-what-youll-learn">What You'll Learn</h2>
<p>By the end of this series, you'll understand:</p>
<ul>
<li>How to design a robust workflow automation architecture from first principles</li>
<li>Best practices for deploying stateful applications on Kubernetes</li>
<li>Implementation of queue-based processing for reliable workflow execution</li>
<li>Security hardening for production deployments</li>
<li>Monitoring, maintenance, and troubleshooting techniques specific to n8n and AKS</li>
<li>Performance and cost optimization strategies</li>
</ul>
<h2 id="heading-who-this-series-is-for">Who This Series Is For</h2>
<p>This series is designed for:</p>
<ul>
<li>DevOps engineers looking to deploy workflow automation tools</li>
<li>Kubernetes administrators seeking stateful application examples</li>
<li>n8n users wanting to scale beyond basic deployments</li>
<li>Cloud architects interested in production-grade Azure implementations</li>
<li>Automation specialists exploring enterprise-ready platforms</li>
</ul>
<p>While some familiarity with Kubernetes concepts and Azure is helpful, I'll explain key concepts along the way to make this accessible to those newer to these technologies.</p>
<h2 id="heading-why-n8n-on-kubernetes">Why n8n on Kubernetes?</h2>
<p>n8n is a powerful workflow automation tool similar to Zapier or Integromat, but with the advantage of being self-hosted. This means you maintain complete control over your data and workflows.</p>
<p>Running n8n on Kubernetes provides several key benefits:</p>
<ul>
<li><strong>High availability</strong>: Ensure your automation workflows run 24/7</li>
<li><strong>Scalability</strong>: Handle growing workflow volume with automatic scaling</li>
<li><strong>Resource efficiency</strong>: Optimize resource allocation based on actual demand</li>
<li><strong>Simplified management</strong>: Standardize deployment, updates, and monitoring</li>
</ul>
<h2 id="heading-series-overview">Series Overview</h2>
<p>Here's what we'll cover in this 8-part journey:</p>
<p><strong>Part 1: Introduction and Architecture Overview</strong>
Understanding the core principles behind a production workflow system and designing a robust architecture.</p>
<p><strong>Part 2: Setting Up the Foundation</strong>
Creating the AKS cluster, configuring namespaces, networking, and persistent storage.</p>
<p><strong>Part 3: Data Layer Implementation</strong>
Deploying PostgreSQL and Redis with security best practices and proper persistence.</p>
<p><strong>Part 4: Application Layer</strong>
Implementing the n8n main service and worker nodes with queue-based processing.</p>
<p><strong>Part 5: External Access and Security</strong>
Configuring ingress, SSL/TLS encryption, and secure access patterns.</p>
<p><strong>Part 6: Monitoring and Optimization</strong>
Setting up monitoring, maintenance procedures, and performance optimization.</p>
<p><strong>Part 7: Troubleshooting Guide</strong>
Comprehensive approach to diagnosing and resolving common issues.</p>
<p><strong>Part 8: Conclusion and Next Steps</strong>
Summarizing our implementation, reviewing benefits, and exploring advanced enhancements.</p>
<h2 id="heading-lets-get-started">Let's Get Started!</h2>
<p>Join me on this journey as we build a production-grade n8n deployment from the ground up. Each article in the series will build upon the previous ones, creating a complete system that is scalable, secure, and maintainable.</p>
]]></content:encoded></item><item><title><![CDATA[Building a Production-Ready Static Website with AWS EC2, Nginx, and Cloudflare]]></title><description><![CDATA[Introduction
In today's digital landscape, deploying static websites efficiently, securely, and cost-effectively is a fundamental skill for developers. In this technical deep dive, I'll walk you through creating a production-grade static website host...]]></description><link>https://nikhilmishra.xyz/building-a-production-ready-static-website-with-aws-ec2-nginx-and-cloudflare</link><guid isPermaLink="true">https://nikhilmishra.xyz/building-a-production-ready-static-website-with-aws-ec2-nginx-and-cloudflare</guid><category><![CDATA[AWS]]></category><category><![CDATA[nginx]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[ec2]]></category><category><![CDATA[Let's Encrypt]]></category><category><![CDATA[deployment]]></category><category><![CDATA[Security]]></category><category><![CDATA[Performance Optimization]]></category><category><![CDATA[Linux]]></category><category><![CDATA[Bash]]></category><category><![CDATA[bash script]]></category><category><![CDATA[infrastructure]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Fri, 28 Feb 2025 17:38:11 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740763989118/e2d168b0-7fa5-4077-b801-091be27191ee.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In today's digital landscape, deploying static websites efficiently, securely, and cost-effectively is a fundamental skill for developers. In this technical deep dive, I'll walk you through creating a production-grade static website hosting solution using AWS EC2, Nginx, Cloudflare, and Let's Encrypt. This project showcases a robust deployment pipeline that ensures reliability, security, and performance.</p>
<p><img src="https://images.unsplash.com/photo-1517694712202-14dd9538aa97?ixlib=rb-1.2.1&amp;auto=format&amp;fit=crop&amp;w=1350&amp;q=80" alt="Static Website Server Banner" /></p>
<h2 id="heading-project-overview">Project Overview</h2>
<p>We're building a complete static site hosting solution with these core components:</p>
<ul>
<li><strong>AWS EC2 Instance</strong> (Amazon Linux 2023) - Our cloud server</li>
<li><strong>Nginx</strong> - Our high-performance web server</li>
<li><strong>Let's Encrypt</strong> - For free, automated SSL/TLS certificates</li>
<li><strong>Cloudflare</strong> - For DNS management, CDN, and additional security layers</li>
<li><strong>Custom Bash Deployment Script</strong> - For automated, secure deployments</li>
</ul>
<h2 id="heading-technical-architecture">Technical Architecture</h2>
<p>Let's begin by understanding the system architecture:</p>
<pre><code class="lang-mermaid">graph TD
    A[Client Browser] --&gt;|HTTPS Request| B[Cloudflare DNS]
    B --&gt;|HTTP Request| C[AWS EC2 Instance]
    C --&gt;|Serves| D[Nginx Web Server]
    D --&gt;|Hosts| E[Static Website Files]
    F[Local Development Environment] --&gt;|Deploy via SCP| C

    subgraph "AWS Cloud"
        C
        D
        E
    end

    subgraph "Cloudflare"
        B --&gt;|SSL Termination| B1[Edge Server]
        B1 --&gt;|Cache| B2[CDN]
    end

    classDef aws fill:#FF9900,stroke:#232F3E,color:white;
    classDef cloudflare fill:#F6821F,stroke:#232F3E,color:white;
    classDef nginx fill:#009639,stroke:#232F3E,color:white;
    class C,E aws;
    class B,B1,B2 cloudflare;
    class D nginx;
</code></pre>
<p>This diagram illustrates how client requests flow through our infrastructure:</p>
<ol>
<li>The client's browser sends an HTTPS request to our domain</li>
<li>Cloudflare handles DNS resolution and SSL termination at its edge servers</li>
<li>The request is forwarded to our AWS EC2 instance</li>
<li>Nginx processes the request and serves the static files</li>
<li>The response flows back through the same path to the client</li>
</ol>
<p>The architecture leverages Cloudflare's global CDN for improved performance and DDoS protection, while keeping our server setup lean and focused.</p>
<h2 id="heading-automated-deployment-pipeline">Automated Deployment Pipeline</h2>
<p>For consistent and reliable deployments, we've created a robust bash script that uses SCP to securely transfer files from your local environment to the server:</p>
<pre><code class="lang-mermaid">flowchart TD
    A[Start Deployment] --&gt; B{SSH Key Exists?}
    B --&gt;|No| C[Error: SSH Key Not Found]
    B --&gt;|Yes| D[Test SSH Connection]
    D --&gt;|Failed| E[Error: SSH Connection Failed]
    D --&gt;|Success| F[Create Temporary Directory on Server]
    F --&gt;|Success| G[Copy Files to Temporary Directory]
    G --&gt;|Failed| H[Clean Up &amp; Exit]
    G --&gt;|Success| I[Move Files to Final Location]
    I --&gt;|Failed| J[Clean Up &amp; Exit]
    I --&gt;|Success| K[Deployment Complete]

    style A fill:#4CAF50,stroke:#006400,color:white
    style K fill:#4CAF50,stroke:#006400,color:white
    style C fill:#FF5252,stroke:#B71C1C,color:white
    style E fill:#FF5252,stroke:#B71C1C,color:white
    style H fill:#FF5252,stroke:#B71C1C,color:white
    style J fill:#FF5252,stroke:#B71C1C,color:white
</code></pre>
<p>Here's the deployment script:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>

<span class="hljs-comment">########</span>
<span class="hljs-comment"># Author: Your Name</span>
<span class="hljs-comment"># Date: 2025-02-28</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># Version: v1.2</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># Static Site Server Deployment Script</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># This script uses scp to sync your static site from your local machine to a remote server.</span>
<span class="hljs-comment">########</span>

<span class="hljs-comment"># Enable debug mode</span>
<span class="hljs-built_in">set</span> -x

<span class="hljs-comment"># Change to script directory</span>
<span class="hljs-built_in">cd</span> <span class="hljs-string">"<span class="hljs-subst">$(dirname <span class="hljs-string">"<span class="hljs-variable">$0</span>"</span>)</span>"</span> || <span class="hljs-built_in">exit</span>

<span class="hljs-comment"># Remote server details</span>
REMOTE_USER=<span class="hljs-string">"ec2-user"</span>
REMOTE_HOST=<span class="hljs-string">"your-ec2-ip-address"</span>
REMOTE_DIR=<span class="hljs-string">"/usr/share/nginx/html"</span>

<span class="hljs-comment"># SSH key path</span>
SSH_KEY=<span class="hljs-string">"<span class="hljs-variable">$HOME</span>/.ssh/your_key.pem"</span>

<span class="hljs-comment"># Check if SSH key exists</span>
<span class="hljs-keyword">if</span> [ ! -f <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> ]; <span class="hljs-keyword">then</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Error: SSH key not found at <span class="hljs-variable">$SSH_KEY</span>"</span>
    <span class="hljs-built_in">exit</span> 1
<span class="hljs-keyword">fi</span>

<span class="hljs-comment"># Test SSH connection first</span>
<span class="hljs-built_in">echo</span> <span class="hljs-string">"Testing SSH connection..."</span>
<span class="hljs-keyword">if</span> ! ssh -i <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> -o BatchMode=yes -o ConnectTimeout=5 <span class="hljs-string">"<span class="hljs-variable">$REMOTE_USER</span>@<span class="hljs-variable">$REMOTE_HOST</span>"</span> <span class="hljs-built_in">echo</span> <span class="hljs-string">"SSH connection successful"</span>; <span class="hljs-keyword">then</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Error: SSH connection failed. Please check your SSH key and server configuration."</span>
    <span class="hljs-built_in">exit</span> 1
<span class="hljs-keyword">fi</span>

<span class="hljs-comment"># Create a temporary directory on the remote server</span>
<span class="hljs-built_in">echo</span> <span class="hljs-string">"Creating temporary directory on remote server..."</span>
TEMP_DIR=<span class="hljs-string">"/tmp/static-site-<span class="hljs-subst">$(date +%s)</span>"</span>
<span class="hljs-keyword">if</span> ! ssh -i <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_USER</span>@<span class="hljs-variable">$REMOTE_HOST</span>"</span> <span class="hljs-string">"mkdir -p <span class="hljs-variable">$TEMP_DIR</span>"</span>; <span class="hljs-keyword">then</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Error: Failed to create temporary directory"</span>
    <span class="hljs-built_in">exit</span> 1
<span class="hljs-keyword">fi</span>

<span class="hljs-comment"># Copy files to temporary directory</span>
<span class="hljs-built_in">echo</span> <span class="hljs-string">"Copying files to remote server..."</span>
<span class="hljs-keyword">if</span> ! scp -i <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> -r ./* <span class="hljs-string">"<span class="hljs-variable">$REMOTE_USER</span>@<span class="hljs-variable">$REMOTE_HOST</span>:<span class="hljs-variable">$TEMP_DIR</span>/"</span>; <span class="hljs-keyword">then</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Error: Failed to copy files"</span>
    ssh -i <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_USER</span>@<span class="hljs-variable">$REMOTE_HOST</span>"</span> <span class="hljs-string">"rm -rf <span class="hljs-variable">$TEMP_DIR</span>"</span>
    <span class="hljs-built_in">exit</span> 1
<span class="hljs-keyword">fi</span>

<span class="hljs-comment"># Move files to final location</span>
<span class="hljs-built_in">echo</span> <span class="hljs-string">"Moving files to final location..."</span>
<span class="hljs-keyword">if</span> ! ssh -i <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_USER</span>@<span class="hljs-variable">$REMOTE_HOST</span>"</span> <span class="hljs-string">"sudo rm -rf <span class="hljs-variable">$REMOTE_DIR</span>/* &amp;&amp; sudo cp -r <span class="hljs-variable">$TEMP_DIR</span>/* <span class="hljs-variable">$REMOTE_DIR</span>/ &amp;&amp; sudo chown -R nginx:nginx <span class="hljs-variable">$REMOTE_DIR</span> &amp;&amp; sudo chmod -R 755 <span class="hljs-variable">$REMOTE_DIR</span> &amp;&amp; rm -rf <span class="hljs-variable">$TEMP_DIR</span>"</span>; <span class="hljs-keyword">then</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Error: Failed to move files to final location"</span>
    ssh -i <span class="hljs-string">"<span class="hljs-variable">$SSH_KEY</span>"</span> <span class="hljs-string">"<span class="hljs-variable">$REMOTE_USER</span>@<span class="hljs-variable">$REMOTE_HOST</span>"</span> <span class="hljs-string">"rm -rf <span class="hljs-variable">$TEMP_DIR</span>"</span>
    <span class="hljs-built_in">exit</span> 1
<span class="hljs-keyword">fi</span>

<span class="hljs-built_in">echo</span> <span class="hljs-string">"Deployment completed successfully!"</span>
</code></pre>
<p>This script includes several best practices:</p>
<ul>
<li>SSH connection validation before attempting deployment</li>
<li>Temporary directory usage for atomic deployments</li>
<li>Proper error handling with cleanup on failure</li>
<li>Appropriate file permissions for security</li>
</ul>
<h2 id="heading-cloudflare-integration-dns-and-security">Cloudflare Integration: DNS and Security</h2>
<p>Cloudflare provides an additional layer of protection and performance optimization:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    participant User as User
    participant Browser as Browser
    participant Cloudflare as Cloudflare
    participant EC2 as EC2 Instance
    participant Nginx as Nginx Server

    User-&gt;&gt;Browser: Enter your-domain.com
    Browser-&gt;&gt;Cloudflare: DNS Resolution
    Cloudflare--&gt;&gt;Browser: IP Address (EC2)
    Browser-&gt;&gt;Cloudflare: HTTPS Request
    Note over Cloudflare: SSL Termination
    Cloudflare-&gt;&gt;EC2: HTTP Request
    EC2-&gt;&gt;Nginx: Forward Request
    Nginx-&gt;&gt;Nginx: Process Request
    Note over Nginx: Find Static Files
    Nginx--&gt;&gt;EC2: Serve HTML/CSS/JS
    EC2--&gt;&gt;Cloudflare: HTTP Response
    Cloudflare--&gt;&gt;Browser: HTTPS Response
    Browser--&gt;&gt;User: Display Content
</code></pre>
<p>To set up Cloudflare:</p>
<ol>
<li>Add your domain to Cloudflare and update nameservers</li>
<li>Create an A record pointing to your EC2 instance's IP address</li>
<li>Configure SSL/TLS settings:<ul>
<li>For maximum security: Full (strict) mode (requires valid SSL cert on server)</li>
<li>For simpler setup: Full mode (works with self-signed certs)</li>
</ul>
</li>
<li>Enable additional security features:<ul>
<li>Always Use HTTPS</li>
<li>HSTS (HTTP Strict Transport Security)</li>
<li>Browser Integrity Check</li>
</ul>
</li>
</ol>
<h2 id="heading-performance-optimization">Performance Optimization</h2>
<p>To ensure optimal performance, we implemented several optimizations:</p>
<ol>
<li><p><strong>Nginx Configuration Tuning</strong>:</p>
<ul>
<li>Gzip compression for reduced bandwidth usage</li>
<li>Optimized worker processes based on CPU cores</li>
<li>File cache settings for frequently accessed content</li>
</ul>
</li>
<li><p><strong>Cloudflare Performance Settings</strong>:</p>
<ul>
<li>Auto Minify for HTML, CSS, and JavaScript</li>
<li>Brotli compression (more efficient than gzip)</li>
<li>Rocket Loader for asynchronous JavaScript loading</li>
</ul>
</li>
<li><p><strong>Static Asset Optimization</strong>:</p>
<ul>
<li>WebP image format for better compression</li>
<li>Defer loading of non-critical resources</li>
<li>Cache control headers for optimal browser caching</li>
</ul>
</li>
</ol>
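<p>The Nginx side of these optimizations can be sketched as a config fragment. This is an illustrative starting point rather than our exact production file; the cache sizes and timeouts are placeholder values to tune for your own traffic:</p>
<pre><code class="lang-nginx"># In the main context of nginx.conf: one worker per CPU core
worker_processes auto;

# In the http block: gzip compression for text-based assets
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;
gzip_min_length 1024;

# Cache open file descriptors for frequently accessed content
open_file_cache max=1000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
</code></pre>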
<h2 id="heading-monitoring-and-maintenance">Monitoring and Maintenance</h2>
<p>For ongoing maintenance, we set up:</p>
<ol>
<li><p><strong>Log Rotation</strong>:</p>
<pre><code class="lang-bash"># -d runs logrotate in debug mode: it reports what would be rotated without changing anything
sudo logrotate -d /etc/logrotate.d/nginx
</code></pre>
</li>
<li><p><strong>Simple Health Check Script</strong>:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
<span class="hljs-comment"># health_check.sh</span>
HTTP_STATUS=$(curl -s -o /dev/null -w <span class="hljs-string">"%{http_code}"</span> https://your-domain.com)
<span class="hljs-keyword">if</span> [ <span class="hljs-string">"<span class="hljs-variable">$HTTP_STATUS</span>"</span> -ne 200 ]; <span class="hljs-keyword">then</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"Site is down! HTTP Status: <span class="hljs-variable">$HTTP_STATUS</span>"</span>
    <span class="hljs-comment"># Add notification logic here (email, SMS, etc.)</span>
<span class="hljs-keyword">fi</span>
</code></pre>
</li>
<li><p><strong>Basic Performance Monitoring</strong>:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Monitor Nginx processes and resource usage every 5 seconds</span>
<span class="hljs-comment"># (the [n] in [n]ginx stops grep from matching its own process)</span>
watch -n 5 <span class="hljs-string">"ps aux | grep '[n]ginx'"</span>
</code></pre>
</li>
</ol>
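<p>The alert logic in the health check can be factored into a small function, which makes it easy to test offline and to extend with notifications later. A minimal sketch, assuming any non-200 status should alert; <code>classify_status</code> is a name chosen here for illustration:</p>
<pre><code class="lang-bash">#!/bin/bash
# classify_status: print an alert line for any non-200 HTTP status; stay silent otherwise
classify_status() {
    local status="$1"
    if [ "$status" -ne 200 ]; then
        echo "Site is down! HTTP Status: $status"
    fi
}

# In health_check.sh, feed it the live status:
#   classify_status "$(curl -s -o /dev/null -w '%{http_code}' https://your-domain.com)"

classify_status 503   # prints the alert line
classify_status 200   # prints nothing
</code></pre>
<p>A crontab entry such as <code>*/5 * * * * /usr/local/bin/health_check.sh</code> would run the check every five minutes.</p>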
<h2 id="heading-challenges-and-solutions">Challenges and Solutions</h2>
<h3 id="heading-challenge-1-atomic-deployments">Challenge 1: Atomic Deployments</h3>
<p><strong>Problem</strong>: How to update the site without downtime or showing partial updates?</p>
<p><strong>Solution</strong>: Our deployment script uses a temporary directory approach, only replacing the files after a complete copy is successful. This ensures users never see a partially updated site.</p>
<h3 id="heading-challenge-2-ssl-certificate-management">Challenge 2: SSL Certificate Management</h3>
<p><strong>Problem</strong>: Manual SSL certificate renewal is error-prone and can lead to outages.</p>
<p><strong>Solution</strong>: Automated certificate renewal through certbot's cron job:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"0 3 * * * root certbot renew --quiet"</span> | sudo tee -a /etc/crontab
<span class="hljs-comment"># Confirm renewal works end to end without touching live certificates:</span>
sudo certbot renew --dry-run
</code></pre>
<h3 id="heading-challenge-3-security-hardening">Challenge 3: Security Hardening</h3>
<p><strong>Problem</strong>: Default configurations are often not secure enough for production.</p>
<p><strong>Solution</strong>: Implemented multiple security layers:</p>
<ul>
<li>Strong Nginx security headers</li>
<li>Cloudflare WAF (Web Application Firewall)</li>
<li>Regular security patches via automatic updates</li>
<li>Limited SSH access to specific IP addresses</li>
</ul>
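<p>The "strong Nginx security headers" layer can be illustrated with a server-block fragment. Treat this as a starting point rather than our exact configuration; HSTS in particular should only be enabled once you are sure every subdomain serves HTTPS:</p>
<pre><code class="lang-nginx"># Inside the server block
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
</code></pre>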
<h2 id="heading-future-enhancements">Future Enhancements</h2>
<p>Looking ahead, several enhancements could further improve this setup:</p>
<ol>
<li><p><strong>CI/CD Pipeline Integration</strong>: Connecting with GitHub Actions or similar CI/CD tools for automated testing and deployment.</p>
</li>
<li><p><strong>Infrastructure as Code</strong>: Converting the manual setup to Terraform or CloudFormation templates.</p>
</li>
<li><p><strong>Advanced Monitoring</strong>: Implementing more comprehensive monitoring with tools like Prometheus and Grafana.</p>
</li>
<li><p><strong>Content Versioning</strong>: Implementing a blue-green deployment strategy for zero-downtime updates with rollback capability.</p>
</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>This project demonstrates how to build a robust, secure, and performant static website hosting infrastructure using AWS EC2, Nginx, Let's Encrypt, and Cloudflare. The architecture provides multiple layers of security, optimized performance, and a streamlined deployment process.</p>
<p>By following these steps, you can create a production-grade hosting environment for static websites that strikes an excellent balance between cost, performance, and security. The modular approach also makes it easy to scale or modify specific components as your needs evolve.</p>
<hr />
<p><em>Want to see the full code? Check out the <a target="_blank" href="https://github.com/kaalpanikh/static-site-server">project repository</a> on GitHub!</em></p>
<p><em>Published on February 28, 2025</em></p>
]]></content:encoded></item><item><title><![CDATA[🔄 Custom Subdomain Forwarding with Cloudflare]]></title><description><![CDATA[🎯 What We're Building




Subdomain → Forwards To

iam.nikhilmishra.live → bento.me/kaalpanikh

links.nikhilmishra.live → linktr.ee/kaalpanikh


🔄 How It Works
%%{init: {"theme": "default"}}%%
graph LR
    A[User] -->|Visits| B[iam.nikhilmishra.live]
    ...]]></description><link>https://nikhilmishra.xyz/custom-subdomain-forwarding-with-cloudflare</link><guid isPermaLink="true">https://nikhilmishra.xyz/custom-subdomain-forwarding-with-cloudflare</guid><category><![CDATA[domain forwarding]]></category><category><![CDATA[pagerules]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[subdomains]]></category><category><![CDATA[bento]]></category><category><![CDATA[linktree]]></category><category><![CDATA[dns]]></category><category><![CDATA[Link In Bio]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Wed, 19 Feb 2025 09:22:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739956626514/03787f93-f5d5-4d83-9157-08714866ad67.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-what-were-building">🎯 What We're Building</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Subdomain</td><td>Forwards To</td></tr>
</thead>
<tbody>
<tr>
<td><a target="_blank" href="https://iam.nikhilmishra.live/">iam.nikhilmishra.live</a></td><td><a target="_blank" href="https://bento.me/kaalpanikh">bento.me/kaalpanikh</a></td></tr>
<tr>
<td><a target="_blank" href="https://links.nikhilmishra.live/">links.nikhilmishra.live</a></td><td><a target="_blank" href="https://linktr.ee/kaalpanikh">linktr.ee/kaalpanikh</a></td></tr>
</tbody>
</table>
</div><h2 id="heading-how-it-works">🔄 How It Works</h2>
<pre><code class="lang-mermaid">%%{init: {"theme": "default"}}%%
graph LR
    A[User] --&gt;|Visits| B[iam.nikhilmishra.live]
    A --&gt;|Visits| C[links.nikhilmishra.live]

    subgraph Cloudflare
        B --&gt;|DNS Record| D[CNAME to @]
        C --&gt;|DNS Record| E[CNAME to @]
        D --&gt;|Page Rule| F[301 Redirect]
        E --&gt;|Page Rule| G[301 Redirect]
    end

    F --&gt;|Forwards to| H[bento.me/kaalpanikh]
    G --&gt;|Forwards to| I[linktr.ee/kaalpanikh]

    style Cloudflare fill:#F6821F,stroke:#F6821F,stroke-width:2px
    style H fill:#5D45F9,stroke:#5D45F9,stroke-width:2px
    style I fill:#39E09B,stroke:#39E09B,stroke-width:2px
</code></pre>
<blockquote>
<p><strong>Note:</strong> This setup uses Cloudflare as a workaround since free Bento and Linktree plans don't support custom domains.</p>
</blockquote>
<h2 id="heading-step-by-step-guide">📝 Step-by-Step Guide</h2>
<h3 id="heading-1-create-dns-records">1️⃣ Create DNS Records</h3>
<ol>
<li><p>Log in to your <strong>Cloudflare Dashboard</strong></p>
</li>
<li><p>Select your domain: <code>nikhilmishra.live</code></p>
</li>
<li><p>Go to <strong>DNS</strong> → <strong>Records</strong></p>
</li>
<li><p>Add the following records:</p>
</li>
</ol>
<h4 id="heading-for-bento-profile">For Bento Profile</h4>
<pre><code class="lang-plaintext">Type:   CNAME
Name:   iam
Target: @
Proxy:  ✅ Enabled (Orange Cloud)
</code></pre>
<h4 id="heading-for-linktree-profile">For Linktree Profile</h4>
<pre><code class="lang-plaintext">Type:   CNAME
Name:   links
Target: @
Proxy:  ✅ Enabled (Orange Cloud)
</code></pre>
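<p>If you prefer the API over the dashboard, the same proxied CNAME records can be created through Cloudflare's DNS records endpoint. A sketch; <code>$ZONE_ID</code> and <code>$CF_API_TOKEN</code> are placeholders for your zone ID and an API token with DNS edit permission:</p>
<pre><code class="lang-bash"># Create the proxied CNAME record for the "iam" subdomain
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"type":"CNAME","name":"iam","content":"nikhilmishra.live","proxied":true}'
</code></pre>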
<h3 id="heading-2-set-up-page-rules">2️⃣ Set Up Page Rules</h3>
<ol>
<li><p>Navigate to <strong>Rules</strong> → <strong>Page Rules</strong></p>
</li>
<li><p>Create two page rules:</p>
</li>
</ol>
<h4 id="heading-bento-redirect">Bento Redirect</h4>
<pre><code class="lang-plaintext">URL Pattern: https://iam.nikhilmishra.live/*
Forward to: https://bento.me/kaalpanikh
Status:     301 (Permanent Redirect)
</code></pre>
<h4 id="heading-linktree-redirect">Linktree Redirect</h4>
<pre><code class="lang-plaintext">URL Pattern: https://links.nikhilmishra.live/*
Forward to: https://linktr.ee/kaalpanikh
Status:     301 (Permanent Redirect)
</code></pre>
<h3 id="heading-3-verify-setup">3️⃣ Verify Setup</h3>
<p>After DNS propagation (usually 5-10 minutes):</p>
<ol>
<li><p>Visit <a target="_blank" href="https://iam.nikhilmishra.live/">iam.nikhilmishra.live</a></p>
<ul>
<li>Should redirect to your Bento profile.</li>
</ul>
</li>
<li><p>Visit <a target="_blank" href="https://links.nikhilmishra.live/">links.nikhilmishra.live</a></p>
<ul>
<li>Should redirect to your Linktree profile.</li>
</ul>
</li>
</ol>
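<p>You can also confirm the redirects from the command line. The helper below extracts the <code>Location</code> header from response headers on stdin; the function name is ours, chosen for illustration:</p>
<pre><code class="lang-bash"># Print the Location header value from HTTP response headers on stdin
extract_location() {
    grep -i '^location:' | awk '{print $2}' | tr -d '\r'
}

# Live check (after DNS propagation):
#   curl -sI https://iam.nikhilmishra.live | extract_location
#   should print https://bento.me/kaalpanikh

# Offline demonstration with a sample 301 response:
printf 'HTTP/2 301\r\nlocation: https://bento.me/kaalpanikh\r\n' | extract_location
</code></pre>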
<h2 id="heading-troubleshooting">⚠️ Troubleshooting</h2>
<p>If redirects aren't working:</p>
<ol>
<li><p>Check if Cloudflare proxy is enabled (orange cloud).</p>
</li>
<li><p>Verify page rules are in the correct order.</p>
</li>
<li><p>Clear your browser cache.</p>
</li>
<li><p>Wait a few more minutes for DNS propagation.</p>
</li>
</ol>
<h2 id="heading-useful-links">🔗 Useful Links</h2>
<ul>
<li><p><a target="_blank" href="https://dash.cloudflare.com/">Cloudflare Dashboard</a></p>
</li>
<li><p><a target="_blank" href="https://bento.me/kaalpanikh">Bento Profile</a></p>
</li>
<li><p><a target="_blank" href="https://linktr.ee/kaalpanikh">Linktree Profile</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[I escalated my app to the cloud !]]></title><description><![CDATA[Context :
I deployed a web app locally, and now I want to move it to the cloud for scalability, reliability, ease of management, and automation.
Here, I am using IaaS.
Major Services used will be :

ELB for Nginx

EC2 instead of local vms

Route 53 f...]]></description><link>https://nikhilmishra.xyz/i-escalated-my-app-to-the-cloud</link><guid isPermaLink="true">https://nikhilmishra.xyz/i-escalated-my-app-to-the-cloud</guid><category><![CDATA[AWS]]></category><category><![CDATA[Cloud]]></category><category><![CDATA[Cloud Computing]]></category><category><![CDATA[ec2]]></category><category><![CDATA[ACM]]></category><category><![CDATA[route53]]></category><category><![CDATA[godaddy]]></category><category><![CDATA[Load Balancer]]></category><category><![CDATA[autoscaling]]></category><category><![CDATA[i am]]></category><dc:creator><![CDATA[Nikhil Mishra]]></dc:creator><pubDate>Thu, 11 Jul 2024 12:45:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1720694526119/54e02dab-6a1c-4d48-ac42-7385cffa3bfe.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-context">Context :</h2>
<p>I deployed a web app locally, and now I want to move it to the cloud for <mark>scalability, reliability, ease of management, and automation.</mark></p>
<p>Here, I am using <mark>IaaS.</mark></p>
<h3 id="heading-major-services-used-will-be"><strong>Major services used</strong>:</h3>
<ul>
<li><p>ELB for Nginx</p>
</li>
<li><p>EC2 instead of local VMs</p>
</li>
<li><p>Route 53 for DNS</p>
</li>
<li><p>S3 for artifact storage</p>
</li>
<li><p>Auto Scaling Group</p>
</li>
<li><p>IAM</p>
</li>
<li><p>ACM</p>
</li>
</ul>
<h2 id="heading-the-architecture-changes">The Architecture Changes:</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720697331498/10cf0e6e-af3f-4f30-b144-1eb6c312ac02.png" alt="The Architecture" class="image--center mx-auto" /></p>
<blockquote>
<ol>
<li><p>The user will log in through an endpoint hosted on GoDaddy.</p>
</li>
<li><p>They will access the endpoint via HTTPS, with the certificate managed by ACM.</p>
</li>
<li><p>The user will connect to the ELB endpoint, which only permits HTTPS traffic.</p>
</li>
<li><p>The ELB will then route the user to the application server.</p>
</li>
<li><p>The application server consists of Tomcat instances, managed by an autoscaling group that adjusts based on traffic and only allows traffic on port 8080.</p>
</li>
<li><p>The application server requires access to backend servers, managed by a Route 53 private hosted zone.</p>
</li>
<li><p>These backend servers, which include MySQL, RabbitMQ, and Memcache, are in a separate security group and will only allow traffic on their specific ports.</p>
</li>
</ol>
</blockquote>
<p><strong>I have a certificate ready in <mark>ACM</mark> issued by <mark>GoDaddy</mark>, which I obtained by requesting and adding a CNAME record.</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720698065731/1a3161a9-1a60-463b-818b-c0b936bf0f55.png" alt="godaddy cname record" class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720698097088/d9bd9beb-1715-4443-8a9e-fc8a41f483d8.png" alt="acm" class="image--center mx-auto" /></p>
<h2 id="heading-now-i-made-3-security-groups"><strong>Now I made 3 security groups:</strong></h2>
<p>For the load balancer, allowing HTTP and HTTPS traffic.</p>
<p>For the app server, allowing port 8080 from the load balancer's security group. Also added SSH and port 8080 access from my IP.</p>
<p>For the backend, allowing port 3306 for MySQL, 11211 for Memcached, and 5672 for RabbitMQ from the app server's security group. Also allowed all traffic within the group and SSH for validation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720698219004/d092cbde-577e-4d9a-a9a4-63fcf8f2ff75.png" alt="security group" class="image--center mx-auto" /></p>
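<p>For reference, the app-server rule above can be expressed with the AWS CLI. The group IDs below are placeholders; the key idea is that the source of the rule is the load balancer's security group rather than a CIDR range:</p>
<pre><code class="lang-bash"># Allow port 8080 on the app server only from the ELB's security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-APPSERVER \
  --protocol tcp \
  --port 8080 \
  --source-group sg-LOADBALANCER
</code></pre>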
<h2 id="heading-i-also-have-my-key-ready"><strong>I also have my key ready</strong>:</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720698346197/8d84c2b3-7541-4716-bc76-269249aaf0b5.png" alt="key" class="image--center mx-auto" /></p>
<p>Cloned the <mark>repo</mark> and got the scripts for launching instances.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720698495703/782c120d-d642-47b2-ab81-0e990c624d7f.png" alt="cloned repo" class="image--center mx-auto" /></p>
<h2 id="heading-instances">Instances:</h2>
<p>Launched the DB, Memcached, and RabbitMQ instances with Amazon Linux.</p>
<p>Launched the app server with Ubuntu.</p>
<p>Validated that all services and scripts are working.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720698573988/82c5253e-10fb-4af2-8e91-b9d08fe1ae47.png" alt="ec2 instances" class="image--center mx-auto" /></p>
<h3 id="heading-created-a-hosted-zone-with-simple-routing-rules-by-adding-a-records-for-backend-servers-using-their-private-ips">Created a <strong><mark>hosted zone</mark></strong> with simple routing rules by adding A records for backend servers using their private IPs.</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720699137254/3f8068d2-efe7-4927-91bd-f3bc36fb6870.png" alt="hosted zone" class="image--center mx-auto" /></p>
<h2 id="heading-now-you-need-to-build-and-upload-the-artifact"><mark>Now, you need to build and upload the artifact.</mark></h2>
<p>Update the <code>application.properties</code> file with the correct server routes, then build the artifact locally using JDK and Maven.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720699319360/208425a7-66ba-4a28-8b4e-cdbea7a96fe9.png" alt="build" class="image--center mx-auto" /></p>
<p>Create an <strong><mark>IAM user</mark></strong> for uploading the artifact, then create an S3 bucket and push the artifact to it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720700421655/df459b1b-9405-4227-8529-00a245f53943.png" alt="iam user" class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720700519405/cb23bb6b-ef21-478b-a37c-394a2f18e1db.png" alt="aws config" class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720700644074/1628fe5f-d70e-457f-9476-6e7b4775adbb.png" alt="s3" class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720700703141/9d0ee4f8-856c-4110-9f74-af26f495f895.png" alt="artifact" class="image--center mx-auto" /></p>
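<p>The build-and-upload step boils down to two commands. A sketch; the bucket name is a placeholder, and the CLI must first be configured with the IAM user's credentials via <code>aws configure</code>:</p>
<pre><code class="lang-bash"># Build the WAR with Maven
mvn clean install

# Push the artifact to the S3 bucket
aws s3 cp target/*.war s3://my-artifact-bucket/
</code></pre>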
<p>Created an <strong><mark>IAM role</mark></strong> and gave access to the app server to download the artifact and start the Tomcat service for our app.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720700759157/43a930e4-63a1-4b02-b6e0-92bcd45de177.png" alt="i am role" class="image--center mx-auto" /></p>
<h2 id="heading-load-balancer">Load Balancer:</h2>
<p>Now, for setting up a <strong><mark>load balancer</mark></strong>, I created a <strong><mark>target group</mark></strong> and added our instance that listens on port 8080 as the target. I also set up a health check at <code>/login</code> on port 8080.</p>
<p>I then created an internet-facing application load balancer that is available on all subnets. It listens for both HTTP and HTTPS traffic by adding the target group and certificate.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720701012436/e3cfc9c6-a3cb-46e2-a9a6-ab09c197c882.png" alt="load balancer" class="image--center mx-auto" /></p>
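<p>The same target group can be created from the CLI. A sketch with a placeholder VPC ID, matching the <code>/login</code> health check on port 8080 described above:</p>
<pre><code class="lang-bash"># Target group for the Tomcat app servers with a /login health check
aws elbv2 create-target-group \
  --name app-tg \
  --protocol HTTP \
  --port 8080 \
  --vpc-id vpc-PLACEHOLDER \
  --health-check-path /login \
  --health-check-port 8080
</code></pre>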
<p>Now we have our application up and running once we add the ELB endpoint to DNS as a CNAME record.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720701065528/7d8c0ef9-de33-4669-86a2-332bbdad7798.png" alt="loadbalancer endpoint" class="image--center mx-auto" /></p>
<h2 id="heading-heres-our-working-app-with-all-services-validated"><strong><mark>Here's our working app with all services validated</mark></strong></h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/H1aIZtZ50sE">https://youtu.be/H1aIZtZ50sE</a></div>
<p> </p>
<h2 id="heading-auto-scaling">Auto Scaling:</h2>
<p>I wanted to make my app ready to scale up based on traffic. To add an auto-scaling group, I first created an <strong><mark>AMI</mark></strong> of my app instance.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720701315267/ee081137-c8b8-4ba0-a210-bb49142312bc.png" alt="ami" class="image--center mx-auto" /></p>
<p>Created a <mark>launch template</mark> using the <mark>AMI</mark>.</p>
<p>Then, I created an <strong>auto-scaling group</strong> with desired triggers and added alarms using <mark>SNS</mark>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720701501670/3f9f6a6e-2dc1-4b49-a0d1-8bb432f3d00d.png" alt="auto scaling group" class="image--center mx-auto" /></p>
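<p>As a sketch, a traffic-based trigger can be attached with a target-tracking policy via the CLI. The names are placeholders, and 50% average CPU is an example threshold, not necessarily the value we used:</p>
<pre><code class="lang-bash"># Keep the group's average CPU utilization around 50%
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name app-asg \
  --policy-name cpu-target-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
    "TargetValue": 50.0
  }'
</code></pre>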
<h2 id="heading-and-we-are-done-here-is-our-webapp-working-on-cloud"><strong><em><mark>And we are done. Here is our web app working on the cloud:</mark></em></strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720701577904/1fb0a39a-0d4d-497a-a0a8-4437e47c98ac.png" alt="completed setup" class="image--center mx-auto" /></p>
<blockquote>
<p>If you like what I'm working on, leave some feedback in the comments; a like would be great too. Also, subscribe to my newsletter to get more blogs like this delivered straight to your inbox.</p>
</blockquote>
<p>Thank you!</p>
]]></content:encoded></item></channel></rss>