🚧 Work in Progress
This page is currently being updated as and when Anant gets time. Once it is fully updated, this message will be removed.
This page covers the “AI PC”: a dedicated AI/ML machine with a dual RTX 3060 configuration.
Commentary
This is my dedicated AI machine, built specifically to handle AI/ML workloads that previously needed workarounds on my older Dell R720xd server because of its AVX2 limitations. The system runs Linux with NVIDIA drivers configured for optimal GPU utilization.
Dual GPU Setup: Two MSI RTX 3060 cards work together on AI workloads, providing 24GB of total VRAM and 7168 CUDA cores for parallel processing.
Configuration Requirements: Secure boot was disabled to allow easy loading of NVIDIA kernel modules, and dummy HDMI plugs are installed in both cards to ensure proper driver initialization and prevent display-related issues in headless operation.
Purpose: This machine serves as a dedicated AI worker, replacing the need for CPU-based AI solutions. The system is designed to be a pure AI processing unit, accessible remotely via Tailscale for seamless integration with other machines in my setup.
Usage Philosophy: Keep this system focused solely on AI workloads - no general computing tasks, just pure AI processing power for models, training, and inference tasks.
Technical Specs
System Configuration
CPU: Intel Core i7-14700F
Motherboard: MSI motherboard
RAM: 128GB system memory
Storage: Samsung M.2 SATA SSDs
Operating System: Linux with NVIDIA drivers
Graphics Cards
GPU: 2x MSI RTX 3060
Memory per Card: 12GB GDDR6
Total VRAM: 24GB GDDR6
Total CUDA Cores: 7168 (3584 per card)
Total Tensor Cores: 224 (112 per card)
Power Requirements: ~340W TDP total
Dummy HDMI Plugs: Installed in both cards for headless operation
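The aggregate figures above are simply the single-card specs doubled; a quick shell sanity check:

```shell
# Totals derive from two identical cards at the RTX 3060's published specs:
per_card_tdp=170        # watts, single RTX 3060 TDP
per_card_cuda=3584      # CUDA cores per RTX 3060
total_tdp=$((2 * per_card_tdp))
total_cuda=$((2 * per_card_cuda))
echo "${total_tdp}W total TDP, ${total_cuda} CUDA cores"   # 340W total TDP, 7168 CUDA cores
```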
Single RTX 3060 Specifications
Memory Bus: 192-bit
Base Clock: 1320 MHz
Boost Clock: 1777 MHz
RT Cores: 28 (2nd Gen)
Tensor Cores: 112 (3rd Gen)
Power: 170W TDP
Connectors: 3x DisplayPort 1.4a, 1x HDMI 2.1
Useful Applications
AI Model Management
Ollama - Primary AI model management and inference
CUDA - Parallel computing framework
TensorFlow - Machine learning platform
PyTorch - Deep learning framework
Hugging Face - AI model repository and tools
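A minimal Ollama workflow on this box might look like the following, assuming Ollama is installed and its service is running; "llama3" is just an example model name from the Ollama registry:

```shell
# Download a model into the local store (example model name):
ollama pull llama3

# Run a one-off prompt against it:
ollama run llama3 "Summarise the benefits of dual-GPU inference."

# See which models are stored locally:
ollama list
```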
Remote Access and Networking
Tailscale - Secure networking for remote access from other machines
SSH - Remote command line access
Network Services - Expose AI services to other devices in the network
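A sketch of the remote-access flow, assuming Tailscale is installed on both ends; "ai-pc" and "user" are placeholder MagicDNS hostname and username:

```shell
# One-time on the AI PC: join the tailnet (prints a browser login URL):
sudo tailscale up

# From any other machine on the same tailnet:
tailscale ping ai-pc     # verify connectivity over the tailnet
ssh user@ai-pc           # remote command-line access
```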
System Management
NVIDIA Drivers - Linux kernel modules for GPU access
nvidia-smi - GPU monitoring and management
CUDA Toolkit - Development environment
Docker - Containerized AI workloads
Linux System Tools - Basic system monitoring and management
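For containerized workloads, Docker needs the NVIDIA Container Toolkit installed before `--gpus` works; the CUDA image tag below is an example:

```shell
# Expose both cards to a container and verify they are visible:
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# Limit a container to a single card (index 0):
docker run --rm --gpus '"device=0"' nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```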
Tips and Tricks
Linux Configuration
Secure Boot: Disable to allow easy loading of NVIDIA kernel modules
NVIDIA Drivers: Install and configure Linux drivers for dual GPU support
Dummy HDMI Plugs: Install in both cards to ensure proper driver initialization
Kernel Modules: Ensure proper loading of nvidia and nvidia_uvm modules
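The checks above can be done from a shell; this is a sketch of the usual verification steps on a systemd-based distro:

```shell
# Confirm the NVIDIA kernel modules are loaded:
lsmod | grep -E '^nvidia'

# Load them by hand if missing:
sudo modprobe nvidia
sudo modprobe nvidia_uvm

# Confirm Secure Boot is off (unsigned NVIDIA modules only load when it is):
mokutil --sb-state        # expect: SecureBoot disabled
```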
Dual GPU Management
nvidia-smi: Monitor both GPUs and their utilization
CUDA_VISIBLE_DEVICES: Control which GPUs are accessible to applications
Memory Management: Monitor VRAM usage across both cards
Temperature Monitoring: Keep track of thermal performance
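The monitoring and device-selection points above in practice, where `app.py` is a placeholder for whatever workload is being pinned:

```shell
# Per-GPU utilisation, VRAM and temperature, refreshed every 2 seconds:
nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total,temperature.gpu \
  --format=csv -l 2

# Restrict a process to specific cards (indices follow nvidia-smi ordering):
CUDA_VISIBLE_DEVICES=0 python app.py     # first RTX 3060 only
CUDA_VISIBLE_DEVICES=0,1 python app.py   # both cards
```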
AI Workload Optimization
Multi-GPU Training: Configure frameworks to utilize both cards
Memory Distribution: Balance workloads across the two cards - the 24GB total is split as 12GB per card, not a single shared pool
Power Management: Monitor total power consumption (~340W)
Driver Updates: Keep NVIDIA drivers updated for optimal performance
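For multi-GPU training, PyTorch's launcher starts one process per card; `train.py` is a placeholder for a DDP-aware training script:

```shell
# Launch a DistributedDataParallel job across both RTX 3060s:
torchrun --nproc_per_node=2 train.py

# Keep an eye on total board power draw while it runs:
nvidia-smi --query-gpu=power.draw --format=csv
```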
Dedicated AI Worker Setup
Ollama Configuration: Set up for optimal model loading and inference
Tailscale Integration: Configure for secure remote access from other machines
Service Exposure: Make AI services available to network devices
Resource Dedication: Keep system focused solely on AI tasks
Remote Management: Access and control via Tailscale from other machines
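Tying the worker setup together: Ollama binds to localhost by default, so exposing it to tailnet peers means setting its documented `OLLAMA_HOST` variable on the service; "ai-pc" is a placeholder MagicDNS name:

```shell
# Override the service so the daemon listens on all interfaces:
sudo systemctl edit ollama
# add in the editor:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama

# From another machine on the tailnet, list the models remotely:
curl http://ai-pc:11434/api/tags
```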