Distributed Networking Interview Questions: CAP, Consensus, and System Design (2026)

Published: 1/15/2026 · Updated: 1/15/2026

Distributed networking underpins modern systems, from microservices to blockchains and global CDNs. Understanding these concepts is crucial for senior engineering roles. This guide covers the most common interview questions with detailed answers and practical examples.

Fundamentals

1. What is a distributed system?

Answer: A distributed system is a collection of independent computers that appears to its users as a single coherent system. Key characteristics:

Concurrency: Components execute simultaneously
No global clock: Each node has its own independent clock
Independent failures: Any component can fail while the rest keep running
Message passing: Nodes communicate through network messages

Examples: Google Search, Netflix streaming, the Bitcoin network.

2. Explain the CAP theorem

Answer: The CAP theorem states that a distributed system can guarantee at most two of three properties at the same time:

Consistency: Every read sees the most recent write, so all nodes appear to hold the same data
Availability: Every request to a non-failing node receives a response
Partition tolerance: The system keeps operating even when the network drops or delays messages between nodes

In practice, partition tolerance is not optional (networks fail), so during a partition you choose between CP (consistency) and AP (availability):

CP Systems: MongoDB, HBase, ZooKeeper, etcd
- Sacrifice availability during partitions
- Strong consistency guarantees

AP Systems: Cassandra, DynamoDB, CouchDB
- Remain available during partitions
- Eventually consistent

3. What is eventual consistency?

Answer: Eventual consistency guarantees that, if no new updates are made, all replicas will eventually converge to the same value. It is a weaker guarantee than strong consistency, but it allows higher availability.

// Example: DNS propagation
// Update takes time to propagate globally
// Different users may see different values temporarily
// Eventually, all DNS servers have the same record

// Conflict resolution strategies:
// 1. Last-write-wins (LWW) - timestamp-based
// 2. Vector clocks - track causality
// 3. CRDTs - mathematically guaranteed convergence
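
To make the third strategy concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The class name, the replica IDs, and the in-memory storage are assumptions made for illustration; a real system would ship the state between replicas over the network.

```javascript
// Grow-only counter (G-Counter): each replica increments only its own slot,
// and merging takes the per-slot maximum.
class GCounter {
  constructor(replicaId) {
    this.replicaId = replicaId; // hypothetical replica name, e.g. 'a'
    this.counts = {};           // replicaId -> total contributed by that replica
  }

  increment(by = 1) {
    this.counts[this.replicaId] = (this.counts[this.replicaId] || 0) + by;
  }

  value() {
    return Object.values(this.counts).reduce((sum, n) => sum + n, 0);
  }

  // Merge is commutative, associative, and idempotent, so replicas converge
  // regardless of message order or duplicate deliveries.
  merge(other) {
    for (const [id, n] of Object.entries(other.counts)) {
      this.counts[id] = Math.max(this.counts[id] || 0, n);
    }
  }
}

const a = new GCounter('a');
const b = new GCounter('b');
a.increment();
a.increment(); // replica a has counted 2
b.increment(); // replica b has counted 1; the replicas disagree for now

a.merge(b);
b.merge(a);
b.merge(a);    // a duplicate merge changes nothing
console.log(a.value(), b.value()); // 3 3
```

Because merge only ever takes a maximum, no timestamps are needed and no update can be lost, which is the trade-off against last-write-wins.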

4. Explain the difference between horizontal and vertical scaling

Answer:

Aspect	Vertical Scaling	Horizontal Scaling
Method	Add resources to one machine	Add more machines
Cost	Expensive at scale	Commodity hardware
Limit	Hardware ceiling	Theoretically unlimited
Complexity	Simple	Requires distribution logic
Downtime	Usually required	Possible with no downtime

Consensus and Coordination

5. What is the Raft consensus algorithm?

Answer: Raft is a consensus algorithm for managing a replicated log. It was designed to be understandable (unlike Paxos). Key components:

Leader election: One node is elected leader and handles all client requests
Log replication: The leader replicates log entries to the followers
Safety: Only nodes with up-to-date logs can become leader

Raft states:
1. Follower - Default state, responds to leader
2. Candidate - Requesting votes for leadership
3. Leader - Handles all client requests

Election process:
1. Follower timeout expires
2. Becomes candidate, increments term
3. Requests votes from peers
4. Majority votes = new leader
5. Sends heartbeats to maintain leadership
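
The voting step of the election process above can be sketched as a pure function. This is an illustrative sketch, not a full Raft implementation: the field names follow the Raft paper, while the plain-object `state`, the `role` field, and the node IDs are assumptions, and persistence, timers, and networking are left out.

```javascript
// How a Raft node might answer a RequestVote RPC.
function handleRequestVote(state, req) {
  // A candidate from an older term is rejected outright.
  if (req.term < state.currentTerm) {
    return { term: state.currentTerm, voteGranted: false };
  }

  // A newer term clears our vote and demotes us to follower.
  if (req.term > state.currentTerm) {
    state.currentTerm = req.term;
    state.votedFor = null;
    state.role = 'follower';
  }

  // Safety rule: the candidate's log must be at least as up to date as ours.
  const logOk =
    req.lastLogTerm > state.lastLogTerm ||
    (req.lastLogTerm === state.lastLogTerm &&
      req.lastLogIndex >= state.lastLogIndex);

  // At most one vote per term.
  const canVote = state.votedFor === null || state.votedFor === req.candidateId;

  if (logOk && canVote) {
    state.votedFor = req.candidateId;
    return { term: state.currentTerm, voteGranted: true };
  }
  return { term: state.currentTerm, voteGranted: false };
}

// A candidate wins once it holds votes from a strict majority of the cluster.
function hasMajority(votes, clusterSize) {
  return votes > Math.floor(clusterSize / 2);
}

const follower = {
  currentTerm: 3, votedFor: null, role: 'follower',
  lastLogTerm: 3, lastLogIndex: 7,
};

const r1 = handleRequestVote(follower,
  { term: 4, candidateId: 'n2', lastLogTerm: 3, lastLogIndex: 7 });
const r2 = handleRequestVote(follower,
  { term: 4, candidateId: 'n3', lastLogTerm: 3, lastLogIndex: 9 });
console.log(r1.voteGranted, r2.voteGranted);       // true false
console.log(hasMajority(3, 5), hasMajority(2, 4)); // true false
```

The second request is refused even though its log is longer, because the follower has already voted in term 4. One vote per term is what prevents two leaders from being elected in the same term.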

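6. What is a distributed lock?

Answer: A distributed lock ensures that only one process across multiple nodes can access a resource at a time. Implementation challenges include lock expiry while the holder is still working and releasing only a lock you still own:

// Single-instance Redis lock. Redlock builds on this primitive by acquiring
// the same lock on a majority of several independent Redis nodes.
const crypto = require('crypto');
const Redis = require('ioredis');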

async function acquireLock(redis, key, ttl) {
  const token = crypto.randomUUID();
  const result = await redis.set(key, token, 'NX', 'PX', ttl);
  return result === 'OK' ? token : null;
}

async function releaseLock(redis, key, token) {
  // Lua script for atomic check-and-delete
  const script = `
    if redis.call("get", KEYS[1]) == ARGV[1] then
      return redis.call("del", KEYS[1])
    else
      return 0
    end
  `;
  return redis.eval(script, 1, key, token);
}

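// Usage (wrapped in an async function because CommonJS has no top-level await)
async function main() {
  const redis = new Redis(); // assumes a Redis server on localhost:6379
  const token = await acquireLock(redis, 'my-resource', 30000);
  if (token) {
    try {
      // Critical section (must finish before the 30 s TTL expires)
    } finally {
      await releaseLock(redis, 'my-resource', token);
    }
  }
  await redis.quit();
}

main().catch(console.error);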

7. Explain vector clocks

Answer: Vector clocks track causality between events in distributed systems. Each node maintains a vector of logical timestamps, one entry per node:

// Vector clock example with 3 nodes
// Initial: [0, 0, 0]

// Node A sends message: [1, 0, 0]
// Node B receives and sends: [1, 1, 0]
// Node C receives: [1, 1, 1]

// Comparison rules:
// V1 < V2 if all V1[i] <= V2[i] and at least one V1[i] < V2[i]
// V1 || V2 (concurrent) if neither V1 < V2 nor V2 < V1

class VectorClock {
  constructor(nodeId, numNodes) {
    this.nodeId = nodeId;
    this.clock = new Array(numNodes).fill(0);
  }

  increment() {
    this.clock[this.nodeId]++;
    return [...this.clock];
  }

  update(received) {
    for (let i = 0; i < this.clock.length; i++) {
      this.clock[i] = Math.max(this.clock[i], received[i]);
    }
    this.clock[this.nodeId]++;
  }

  compare(other) {
    let less = false, greater = false;
    for (let i = 0; i < this.clock.length; i++) {
      if (this.clock[i] < other[i]) less = true;
      if (this.clock[i] > other[i]) greater = true;
    }
    if (less && !greater) return -1; // this happened before
    if (greater && !less) return 1;  // other happened before
    return 0; // concurrent, or the two clocks are identical
  }
}

Network Protocols

8. Compare TCP vs UDP for distributed systems

Answer:

Feature	TCP	UDP
Connection	Connection-oriented	Connectionless
Reliability	Reliable delivery (acknowledgements and retransmission)	Best effort
Ordering	Ordered	No ordering
Speed	Slower (handshake, acknowledgements)	Faster
Use cases	HTTP/1.1 and HTTP/2, databases	DNS, live streaming, games

// When to use each:
// TCP: When you need reliability
// - Database connections
// - File transfers
// - API calls

// UDP: When latency matters more than reliability
// - Real-time gaming
// - Live video and voice calls
// - DNS queries
// - Lightweight heartbeats (e.g. gossip-based failure detection)

9. What is gRPC and when would you use it?

Answer: gRPC is a high-performance RPC framework built on Protocol Buffers and HTTP/2. It is a good fit for internal service-to-service calls that need low latency, streaming, or strictly typed contracts:

Binary protocol: Smaller payloads than JSON
Streaming: Support for bidirectional streaming
Code generation: Type-safe clients and servers
HTTP/2: Multiplexing, header compression

// user.proto
syntax = "proto3";

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User);
  rpc CreateUser(User) returns (User);
}

message User {
  string id = 1;
  string name = 2;
  string email = 3;
}

message GetUserRequest {
  string id = 1;
}

// Referenced by ListUsers above; left empty here so the file compiles
message ListUsersRequest {}

10. Explain service mesh architecture

Answer: A service mesh is an infrastructure layer for service-to-service communication. Components:

Data plane: Sidecar proxies (Envoy) handle the traffic
Control plane: Manages the configuration of the proxies

# Istio example - traffic splitting
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
  - my-service
  http:
  - route:
    - destination:
        host: my-service
        subset: v1
      weight: 90
    - destination:
        host: my-service
        subset: v2
      weight: 10

Benefits: mTLS, observability, traffic management, and retries without code changes.

Fault Tolerance

11. What is the circuit breaker pattern?

Answer: A circuit breaker prevents cascading failures by failing fast when a downstream service is unhealthy:

class CircuitBreaker {
  constructor(options) {
    this.failureThreshold = options.failureThreshold || 5;
    this.resetTimeout = options.resetTimeout || 30000;
    this.state = 'CLOSED';
    this.failures = 0;
    this.lastFailure = null;
  }

  async call(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailure > this.resetTimeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failures++;
    this.lastFailure = Date.now();
    if (this.failures >= this.failureThreshold) {
      this.state = 'OPEN';
    }
  }
}

12. Explain the bulkhead pattern

Answer: A bulkhead isolates components so that a failure in one does not cascade to the others, like the watertight compartments of a ship:

// Thread pool bulkhead
// (ThreadPool is a hypothetical class; Node.js has no built-in one)
const criticalPool = new ThreadPool({ size: 10 });
const nonCriticalPool = new ThreadPool({ size: 5 });

// If non-critical service exhausts its pool,
// critical operations still have dedicated resources

// Semaphore bulkhead
class Bulkhead {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.current = 0;
    this.queue = [];
  }

  async execute(fn) {
    if (this.current >= this.maxConcurrent) {
      throw new Error('Bulkhead full');
    }

    this.current++;
    try {
      return await fn();
    } finally {
      this.current--;
    }
  }
}

13. What is Byzantine fault tolerance?

Answer: Byzantine fault tolerance (BFT) handles nodes that behave arbitrarily, whether maliciously or because of bugs. It takes its name from the Byzantine Generals Problem:

Requires 3f+1 nodes to tolerate f Byzantine faults
Used in blockchains (PBFT, Tendermint)
More expensive than crash fault tolerance

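PBFT phases:
1. Pre-prepare: Leader (primary) assigns a sequence number and proposes the request
2. Prepare: Replicas broadcast prepare messages; a replica is prepared once it holds the pre-prepare plus 2f matching prepares
3. Commit: Prepared replicas broadcast commit messages and execute after 2f+1 matching commits
4. Reply: Replicas send the result to the client, which waits for f+1 matching replies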

Requires 2f+1 matching messages at each phase
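
The 3f+1 and 2f+1 figures above are easier to remember once the arithmetic is written out. The sketch below assumes a cluster of exactly n = 3f+1 nodes; the helper names are made up for illustration.

```javascript
// Quorum arithmetic for Byzantine fault tolerance, assuming n = 3f + 1.
const minNodes = (f) => 3 * f + 1;                  // cluster size needed for f faults
const quorum = (f) => 2 * f + 1;                    // matching messages per phase
const maxFaults = (n) => Math.floor((n - 1) / 3);   // largest f a cluster of n tolerates

// Any two quorums overlap in at least f + 1 nodes, so the overlap always
// contains at least one honest node even if f of them are Byzantine.
const minOverlap = (f) => 2 * quorum(f) - minNodes(f);

for (const f of [1, 2, 3]) {
  console.log(`f=${f}: nodes=${minNodes(f)}, quorum=${quorum(f)}, overlap=${minOverlap(f)}`);
}
// f=1: nodes=4, quorum=3, overlap=2
// f=2: nodes=7, quorum=5, overlap=3
// f=3: nodes=10, quorum=7, overlap=4
```

That guaranteed honest node in every overlap is what stops two conflicting values from both gathering a quorum.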

Load Balancing and Routing

14. Compare load-balancing algorithms

Answer:

// Round Robin - Simple rotation
class RoundRobin {
  constructor(servers) {
    this.servers = servers;
    this.current = 0;
  }

  next() {
    const server = this.servers[this.current];
    this.current = (this.current + 1) % this.servers.length;
    return server;
  }
}

// Weighted Round Robin - Based on capacity
class WeightedRoundRobin {
  constructor(servers) {
    // servers: [{ host: 'a', weight: 3 }, { host: 'b', weight: 1 }]
    this.servers = [];
    for (const s of servers) {
      for (let i = 0; i < s.weight; i++) {
        this.servers.push(s.host);
      }
    }
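    this.current = 0;
  }

  // Rotate through the expanded list, so a weight-3 host appears 3 times per cycle
  next() {
    const server = this.servers[this.current];
    this.current = (this.current + 1) % this.servers.length;
    return server;
  }
}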

// Least Connections - Route to least busy
class LeastConnections {
  constructor(servers) {
    this.connections = new Map(servers.map(s => [s, 0]));
  }

  next() {
    let min = Infinity, selected;
    for (const [server, count] of this.connections) {
      if (count < min) {
        min = count;
        selected = server;
      }
    }
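    if (selected !== undefined) {
      // Count the connection we are about to hand out
      this.connections.set(selected, min + 1);
    }
    return selected;
  }

  // Call when a connection to `server` closes
  release(server) {
    const count = this.connections.get(server) || 0;
    this.connections.set(server, Math.max(0, count - 1));
  }
}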

// Consistent Hashing - For caches/sharding
// Minimizes redistribution when nodes change

15. Explain consistent hashing

Answer: Consistent hashing distributes data across nodes while minimizing how much data moves when nodes are added or removed:

const crypto = require('crypto');

class ConsistentHash {
  constructor(replicas = 100) {
    this.replicas = replicas;
    this.ring = new Map();
    this.sortedKeys = [];
  }

  hash(key) {
    return crypto.createHash('md5')
      .update(key)
      .digest('hex')
      .substring(0, 8);
  }

  addNode(node) {
    for (let i = 0; i < this.replicas; i++) {
      const hash = this.hash(`${node}:${i}`);
      this.ring.set(hash, node);
      this.sortedKeys.push(hash);
    }
    this.sortedKeys.sort();
  }

  removeNode(node) {
    for (let i = 0; i < this.replicas; i++) {
      const hash = this.hash(`${node}:${i}`);
      this.ring.delete(hash);
      this.sortedKeys = this.sortedKeys.filter(k => k !== hash);
    }
  }

  getNode(key) {
    const hash = this.hash(key);
    for (const nodeHash of this.sortedKeys) {
      if (hash <= nodeHash) {
        return this.ring.get(nodeHash);
      }
    }
    return this.ring.get(this.sortedKeys[0]);
  }
}

Messaging and Queues

16. Compare message queue patterns

Answer:

Point-to-point: Each message is handled by exactly one consumer (task queues)
Pub/sub: Each message is delivered to every subscriber (event broadcast)
Request/reply: The sender waits for a correlated response (synchronous-style messaging)

Delivery guarantees:
- At most once: May lose messages (fastest)
- At least once: May duplicate (requires idempotency)
- Exactly once: Most complex, often needs transactions

Technologies:
- RabbitMQ: Traditional message broker, AMQP
- Kafka: Distributed log, high throughput
- Redis Streams: Simple, built into Redis
- AWS SQS: Managed, scalable

17. What is the outbox pattern?

Answer: The outbox pattern makes message publishing reliable by writing each event in the same database transaction as the business data:

// Instead of:
await db.transaction(async (tx) => {
  await tx.insert('orders', order);
  await messageQueue.publish('order.created', order); // Can fail!
});

// Use outbox pattern:
await db.transaction(async (tx) => {
  await tx.insert('orders', order);
  await tx.insert('outbox', {
    event_type: 'order.created',
    payload: JSON.stringify(order),
    created_at: new Date()
  });
});

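// Separate process polls outbox and publishes.
// A crash between publish and update re-sends the event, so delivery is at-least-once.
async function processOutbox() {
  const events = await db.query(
    'SELECT * FROM outbox WHERE processed = false ORDER BY created_at LIMIT 100'
  );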

  for (const event of events) {
    await messageQueue.publish(event.event_type, event.payload);
    await db.update('outbox', { id: event.id }, { processed: true });
  }
}

Observability

18. What is distributed tracing?

Answer: Distributed tracing follows a request as it crosses service boundaries:

// OpenTelemetry example
const { trace, context, SpanStatusCode } = require('@opentelemetry/api');

const tracer = trace.getTracer('my-service');

async function handleRequest(req) {
  const span = tracer.startSpan('handleRequest');

  try {
    span.setAttribute('user.id', req.userId);

    // Child span for database call: the parent is supplied through a context
    const ctx = trace.setSpan(context.active(), span);
    const dbSpan = tracer.startSpan('database.query', undefined, ctx);
    let result;
    try {
      result = await db.query('...');
    } finally {
      dbSpan.end(); // end the child span even if the query throws
    }

    return result;
  } catch (error) {
    span.recordException(error);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw error;
  } finally {
    span.end();
  }
}

// Trace context propagation
// W3C Trace Context header: traceparent
// Format: version-trace_id-span_id-flags
// Example: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
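
To show what the `traceparent` format above looks like in code, here is a small parser sketch. It assumes the version 00 layout only and is not a substitute for an OpenTelemetry propagator; the function name is made up for illustration.

```javascript
// Parse a W3C Trace Context `traceparent` header (version 00 layout).
function parseTraceparent(header) {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/
    .exec(header.trim());
  if (!match) return null;

  const [, version, traceId, parentId, flags] = match;
  // The spec forbids version ff and all-zero trace or parent IDs.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) {
    return null;
  }
  return {
    version,
    traceId,
    parentId,
    sampled: (parseInt(flags, 16) & 0x01) === 1, // lowest flag bit = sampled
  };
}

const parsed = parseTraceparent(
  '00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01');
console.log(parsed.traceId, parsed.sampled);
// 0af7651916cd43dd8448eb211c80319c true
```

A service that receives this header keeps the trace ID, creates its own span ID, and forwards both downstream, which is how spans from different services end up in one trace.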

19. Explain the three pillars of observability

Answer:

Logs: Discrete events with context
Metrics: Numeric measurements over time
Traces: The path of a request across services

Logs: What happened?
- Structured JSON logs
- Correlation IDs
- Log levels (debug, info, warn, error)

Metrics: How is the system performing?
- Counters: request_count
- Gauges: active_connections
- Histograms: request_duration

Traces: Where did the request go?
- Spans with timing
- Parent-child relationships
- Cross-service context
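
As an illustration of the logging bullet points, the sketch below emits structured JSON logs that carry a correlation ID. The `createLogger` helper, its field names, and the sample values are assumptions for this example, not the API of any particular logging library.

```javascript
// Minimal structured logger: one JSON object per line.
function createLogger(baseFields, write = console.log) {
  const log = (level, message, fields = {}) => {
    const line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      message,
      ...baseFields,
      ...fields,
    });
    write(line);
    return line;
  };

  return {
    info: (message, fields) => log('info', message, fields),
    error: (message, fields) => log('error', message, fields),
    // Bind extra fields (for example a correlation ID) to every later entry.
    child: (extra) => createLogger({ ...baseFields, ...extra }, write),
  };
}

const lines = [];
const logger = createLogger({ service: 'orders' }, (line) => lines.push(line));

// One child logger per request keeps the correlation ID on every entry.
const requestLog = logger.child({ correlationId: 'req-123' });
requestLog.info('order created', { orderId: 42 });
requestLog.error('payment failed', { orderId: 42 });

lines.forEach((line) => console.log(line));
```

Searching the log store for `correlationId: req-123` then returns every entry for that request, across all services that forward the same ID.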

Security

20. How do you secure service-to-service communication?

Answer:

mTLS: Mutual TLS authentication
Service mesh: Automatic mTLS (Istio, Linkerd)
API keys: Simple but less secure
JWT: Stateless authentication

# Istio PeerAuthentication for mTLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: my-namespace
spec:
  mtls:
    mode: STRICT
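
For the JWT option, the following sketch shows why the approach is stateless: the receiver verifies the signature with a shared secret and needs no session lookup. It hand-rolls HS256 with Node's built-in `crypto` module purely for illustration. The function names and the secret are assumptions, and a production service would use a vetted JWT library.

```javascript
const crypto = require('crypto');

const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

function hmac(data, secret) {
  return crypto.createHmac('sha256', secret).update(data).digest();
}

function signJwt(payload, secret) {
  const head = encode({ alg: 'HS256', typ: 'JWT' });
  const body = encode(payload);
  const signature = hmac(`${head}.${body}`, secret).toString('base64url');
  return `${head}.${body}.${signature}`;
}

// Returns the payload if the signature is valid and the token has not expired.
function verifyJwt(token, secret) {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [head, body, signature] = parts;

  const expected = hmac(`${head}.${body}`, secret);
  const given = Buffer.from(signature, 'base64url');
  // Constant-time comparison avoids leaking how many bytes matched.
  if (given.length !== expected.length ||
      !crypto.timingSafeEqual(given, expected)) {
    return null;
  }

  const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  const nowSeconds = Math.floor(Date.now() / 1000);
  if (typeof payload.exp === 'number' && payload.exp < nowSeconds) return null;
  return payload;
}

const secret = 'shared-secret-between-services'; // hypothetical value
const jwt = signJwt(
  { sub: 'orders-service', exp: Math.floor(Date.now() / 1000) + 60 },
  secret);

console.log(verifyJwt(jwt, secret).sub);     // orders-service
console.log(verifyJwt(jwt, 'wrong-secret')); // null
```

A shared secret means every verifier can also mint tokens. Asymmetric algorithms such as RS256 avoid that, at the cost of key distribution.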

System Design Questions

21. Design a distributed rate limiter

Answer:

// Token bucket with Redis
class DistributedRateLimiter {
  constructor(redis, options) {
    this.redis = redis;
    this.capacity = options.capacity;
    this.refillRate = options.refillRate; // tokens per second
  }

  async isAllowed(key) {
    const now = Date.now();
    const script = `
      local key = KEYS[1]
      local capacity = tonumber(ARGV[1])
      local refillRate = tonumber(ARGV[2])
      local now = tonumber(ARGV[3])

      local bucket = redis.call('HMGET', key, 'tokens', 'lastRefill')
      local tokens = tonumber(bucket[1]) or capacity
      local lastRefill = tonumber(bucket[2]) or now

      -- Refill tokens
      local elapsed = (now - lastRefill) / 1000
      tokens = math.min(capacity, tokens + elapsed * refillRate)

      if tokens >= 1 then
        tokens = tokens - 1
        redis.call('HMSET', key, 'tokens', tokens, 'lastRefill', now)
        redis.call('EXPIRE', key, 60)
        return 1
      else
        return 0
      end
    `;

    return await this.redis.eval(script, 1, key,
      this.capacity, this.refillRate, now);
  }
}

22. How would you design a distributed cache?

Answer: Key considerations:

Partitioning: Consistent hashing across nodes
Replication: Primary-replica for fault tolerance
Eviction: LRU, LFU, or TTL-based
Consistency: Write-through, write-behind, or cache-aside

Cache-aside pattern:
1. Check cache
2. If miss, read from database
3. Update cache
4. Return data

Write-through pattern:
1. Write to cache
2. Cache writes to database
3. Ensures consistency but adds latency

Write-behind pattern:
1. Write to cache
2. Cache async writes to database
3. Better performance, eventual consistency
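
The cache-aside steps listed above can be sketched in a few lines. The sketch makes several simplifying assumptions: a single in-process `Map` stands in for the distributed cache, the loader is synchronous (a real database call would be async), and the `fakeDb` object is made up for the demo.

```javascript
// Cache-aside: the application owns both the cache lookup and the database read.
class CacheAside {
  constructor(loadFromDb, ttlMs = 60000) {
    this.loadFromDb = loadFromDb; // hypothetical loader, e.g. a SQL query
    this.ttlMs = ttlMs;
    this.entries = new Map();     // key -> { value, expiresAt }
  }

  get(key) {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > Date.now()) {
      return hit.value;                         // 1. cache hit
    }
    const value = this.loadFromDb(key);         // 2. miss: read from the database
    this.entries.set(key, {                     // 3. update the cache
      value,
      expiresAt: Date.now() + this.ttlMs,
    });
    return value;                               // 4. return the data
  }

  // On writes, invalidate so that the next read reloads fresh data.
  invalidate(key) {
    this.entries.delete(key);
  }
}

let dbReads = 0;
const fakeDb = { 'user:1': 'Ada' }; // stands in for a real database
const cache = new CacheAside((key) => { dbReads++; return fakeDb[key]; });

cache.get('user:1');          // miss: reads the database
cache.get('user:1');          // hit: served from the cache
fakeDb['user:1'] = 'Grace';   // the underlying record changes
cache.invalidate('user:1');
console.log(cache.get('user:1'), dbReads); // Grace 2
```

Without the `invalidate` call the cache would keep serving the old value until the TTL expired, which is the staleness window that cache-aside accepts.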

Conclusion

Distributed systems are complex, but understanding these fundamental concepts (consensus, fault tolerance, networking, and observability) will prepare you for senior engineering interviews. Focus on the trade-offs: there is rarely a perfect solution, only trade-offs appropriate to specific requirements.