Troubleshooting File Store

This guide helps you diagnose and resolve common issues when working with File Store.

Prerequisites

Before you begin troubleshooting, ensure you have:

  • The evroc CLI installed and configured
  • Access to the VM where the File Store is mounted (for mount-related issues)
  • Appropriate IAM permissions to view File Store resources
  • The NFS client tools installed (for mount troubleshooting)

Debug failed mounts

Check File Store status

Before troubleshooting mount issues, verify the File Store is available:

evroc storage filestore get my-filestore

Look for the Ready condition with status "True". If the status is Provisioning or Pending, wait for it to complete.

Verify network connectivity

Test connectivity to the File Store endpoint. Security groups may block ICMP ping, so connect directly to the NFS port instead:

# Replace `<filestore-endpoint>` with your File Store's endpoint IP
# Test if the NFS port is reachable (if telnet is available)
telnet <filestore-endpoint> 2049

# Or attempt a connection using /dev/tcp (a Bash feature, so the command invokes bash explicitly)
timeout 5 bash -c "</dev/tcp/<filestore-endpoint>/2049" && echo "Connection successful"

If connection fails, check:

  • The VM and File Store are in the same VPC
  • Security groups allow traffic between the VM and File Store (TCP port 2049). Note: Default security groups already allow this; you only need to check this if using custom security groups
  • The File Store is ready (its Ready condition has status "True", as described under Check File Store status)
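
The port check above can be wrapped in a small reusable function. This is a minimal sketch, not part of the evroc tooling: the name check_nfs_port and its arguments are illustrative, and it assumes bash and the coreutils timeout command are installed on the VM.

```shell
# check_nfs_port HOST [PORT] [TIMEOUT_SECONDS]
# Hypothetical helper: tries to open a TCP connection through Bash's /dev/tcp.
check_nfs_port() {
  local host="$1" port="${2:-2049}" wait_s="${3:-5}"
  # The inner bash receives host and port as $0 and $1, so nothing is
  # interpolated into the command string itself.
  if timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable: ${host}:${port}"
  else
    echo "unreachable: ${host}:${port}" >&2
    return 1
  fi
}
```

Call it as check_nfs_port <filestore-endpoint>. A non-zero exit status means the connection was refused, unroutable, or did not complete within the timeout.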

Check NFS client installation

Verify the NFS client package is installed (nfs-common on Ubuntu/Debian, nfs-utils on RHEL/CentOS):

# For Ubuntu/Debian
dpkg -l | grep nfs-common

# For RHEL/CentOS
rpm -qa | grep nfs-utils

Check the version of the installed NFS client tools:

nfsstat --version

File Store requires NFSv4.1 support. Ensure your client supports it.
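
The two package checks above can be combined into one script that picks the right query for the distribution. This is a sketch under assumptions: the function names are illustrative, the distribution is read from the ID field of /etc/os-release, and the list of distribution IDs is not exhaustive.

```shell
# Map a distribution ID (as found in /etc/os-release) to its NFS client package.
nfs_client_package() {
  case "$1" in
    ubuntu|debian) echo "nfs-common" ;;
    rhel|centos|rocky|almalinux|fedora) echo "nfs-utils" ;;
    *) echo "unknown distribution: $1" >&2; return 1 ;;
  esac
}

# Return 0 if the NFS client package for this host is installed,
# 1 if it is missing, 2 if the distribution is not recognized.
nfs_client_installed() {
  local id pkg
  id="$(. /etc/os-release && echo "$ID")"
  pkg="$(nfs_client_package "$id")" || return 2
  case "$pkg" in
    nfs-common) dpkg -s "$pkg" >/dev/null 2>&1 ;;
    nfs-utils)  rpm -q "$pkg" >/dev/null 2>&1 ;;
  esac
}
```

Running nfs_client_installed || echo "NFS client tools missing" on the VM gives a quick yes or no before you look at mount errors.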

Review mount error messages

Common mount errors and solutions:

| Error | Cause | Solution |
|---|---|---|
| Connection timed out | Network connectivity issue | Check VPC and security groups |
| No route to host | VM can't reach File Store | Verify endpoint IP and VPC configuration |
| Access denied | Custom security group blocking | Add rule for TCP 2049 (default security groups already allow this) |
| Stale file handle | File Store recovered while files open | Unmount and remount |
| Invalid argument | Wrong mount options | Use nfsvers=4.1 |
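
The error table can also be expressed as a lookup that turns a mount error message into the suggested next step. This is an illustrative sketch: suggest_fix is a hypothetical helper, and it only matches the five messages listed above, ignoring case.

```shell
# suggest_fix "ERROR MESSAGE"
# Prints the suggested action for a known mount error; returns 1 if unknown.
suggest_fix() {
  local msg
  msg="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$msg" in
    *"connection timed out"*) echo "Check VPC and security groups" ;;
    *"no route to host"*)     echo "Verify endpoint IP and VPC configuration" ;;
    *"access denied"*)        echo "Add rule for TCP 2049 to the custom security group" ;;
    *"stale file handle"*)    echo "Unmount and remount" ;;
    *"invalid argument"*)     echo "Use nfsvers=4.1" ;;
    *) echo "No known match; gather diagnostics and contact support"; return 1 ;;
  esac
}
```

For example, pass it the last line that a failed mount command printed to stderr.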

Mount with verbose output

For detailed debugging, mount with verbose flags:

# Replace `<filestore-endpoint>` with your File Store's endpoint IP
sudo mount -t nfs4 -v -o nfsvers=4.1,_netdev <filestore-endpoint>:/ /mnt/filestore

This shows the mount process step by step.

Connection issues

Connection refused

If you see Connection refused:

  1. Verify File Store is running:

    evroc storage filestore get my-filestore
    
  2. Check if the File Store is in the same zone as your VM. The zone is reported under status.placement in the output:

    status:
      placement:
        zone: se-sto-a
    
  3. Verify security groups allow ingress on TCP port 2049 (only needed if using custom security groups; default security groups already allow this)

Connection timeout

If mounts time out:

  1. Check if there are custom security groups or firewall rules blocking traffic (default security groups already allow NFS traffic)

  2. Verify the File Store endpoint IP hasn't changed (it shouldn't)

Mount hangs indefinitely

If a mount command hangs:

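  1. Check if using hard mount option (can cause indefinite hangs on server failure)

  2. Add intr option to allow interruption. Note that Linux kernels newer than 2.6.25 accept intr but ignore it; on those kernels only SIGKILL interrupts a pending NFS operation:

    # Replace `<filestore-endpoint>` with your File Store's endpoint IP
    sudo mount -t nfs4 -o nfsvers=4.1,_netdev,hard,intr <filestore-endpoint>:/ /mnt/filestore
    
  3. If already hung, use Ctrl+C to interrupt. If the command does not respond, send SIGKILL to the process from another terminal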
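
To keep a mount attempt from hanging a script indefinitely, you can bound it with a deadline. The sketch below is one possible approach, not an evroc feature: run_with_deadline is a hypothetical wrapper around the coreutils timeout command, which sends SIGTERM at the deadline and SIGKILL five seconds later.

```shell
# run_with_deadline SECONDS COMMAND [ARGS...]
# Runs COMMAND and stops it if it has not finished after SECONDS.
run_with_deadline() {
  local secs="$1" rc=0
  shift
  # SIGTERM at the deadline, then SIGKILL 5 seconds later if still running.
  timeout --kill-after=5 "$secs" "$@" || rc=$?
  # timeout exits with 124 after SIGTERM and 137 after SIGKILL.
  if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
    echo "timed out after ${secs}s: $*" >&2
  fi
  return "$rc"
}
```

Use it from a root shell or a script started with sudo so that timeout can signal mount directly, for example run_with_deadline 30 mount -t nfs4 -o nfsvers=4.1,_netdev <filestore-endpoint>:/ /mnt/filestore.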

Permission issues

Permission denied on mount

If you see Permission denied when mounting:

  1. Verify you're using sudo:

    sudo mount -t nfs4 ...
    
  2. Check if the mount point exists and is accessible:

    ls -ld /mnt/filestore
    
  3. Ensure the mount point is empty:

    ls -la /mnt/filestore
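
The mount point checks in steps 2 and 3 can be scripted. This is a minimal sketch with a hypothetical function name; it only checks that the directory exists and is empty.

```shell
# check_mount_point DIR
# Returns 0 if DIR exists and is empty, 1 if it is missing, 2 if it has entries.
check_mount_point() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # ls -A lists everything except "." and "..", including hidden files.
  if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "not empty: $dir"
    return 2
  fi
  echo "ok: $dir"
}
```

Run check_mount_point /mnt/filestore before mounting; create the directory with sudo mkdir -p if it is reported missing.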
    

Permission denied accessing files

If mount succeeds but you can't access files:

  1. Check file permissions:

    ls -la /mnt/filestore
    
  2. Verify user/group ownership:

    id
    ls -la /mnt/filestore | head
    
  3. Check if you need root access:

    sudo ls -la /mnt/filestore
    
  4. Verify directory permissions allow access
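
The ownership comparison in steps 1 and 2 can be automated. The sketch below uses a hypothetical helper, explain_access, and assumes GNU stat (the -c option); it reports which permission class applies to the current user for a given path.

```shell
# explain_access PATH
# Prints whether the current user is the owner, a group member, or "other"
# for PATH, together with the octal permission mode.
explain_access() {
  local path="$1" owner_uid owner_gid mode
  owner_uid="$(stat -c '%u' "$path")" || return 1
  owner_gid="$(stat -c '%g' "$path")" || return 1
  mode="$(stat -c '%a' "$path")" || return 1
  if [ "$(id -u)" -eq "$owner_uid" ]; then
    echo "owner (mode $mode)"
  elif id -G | tr ' ' '\n' | grep -qx "$owner_gid"; then
    echo "group member (mode $mode)"
  else
    echo "other (mode $mode)"
  fi
}
```

If the result is "other" and the last digit of the mode grants no access, the file is only reachable through root or a change of ownership.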

Stale file handle errors

If you see Stale file handle errors:

Cause: The File Store recovered (pod rescheduled) while you had files open.

Solution:

  1. Close all applications using the File Store

  2. Unmount the File Store:

    sudo umount /mnt/filestore
    
  3. If unmount fails, use lazy unmount:

    sudo umount -l /mnt/filestore
    
  4. Remount:

    # Replace `<filestore-endpoint>` with your File Store's endpoint IP
    sudo mount -t nfs4 -o nfsvers=4.1,_netdev <filestore-endpoint>:/ /mnt/filestore
    

Note: If files were open during the recovery, applications may need to reopen them.
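
The unmount and remount sequence can be collected into one function. This is a sketch under assumptions: the function names are illustrative, and by default it only prints the commands it would run. Set DRY_RUN=0 to execute them.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# remount_filestore ENDPOINT [MOUNT_POINT]
remount_filestore() {
  local endpoint="$1" mnt="${2:-/mnt/filestore}"
  # Try a normal unmount first, then fall back to a lazy unmount.
  run sudo umount "$mnt" || run sudo umount -l "$mnt" || return 1
  run sudo mount -t nfs4 -o nfsvers=4.1,_netdev "${endpoint}:/" "$mnt"
}
```

Review the dry-run output of remount_filestore <filestore-endpoint> first, then repeat with DRY_RUN=0 once applications using the File Store are closed.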

Performance issues

Slow read/write performance

If performance is slower than expected (<100 MB/s):

  1. Check mount options:

    mount | grep filestore
    

    Ensure rsize and wsize are set to 1048576:

    sudo mount -t nfs4 -o nfsvers=4.1,_netdev,rsize=1048576,wsize=1048576 ...
    
  2. Verify same-zone access:

    • File Store performance is guaranteed only within the same zone
    • You may experience variable latency with cross-zone access
  3. Check network latency. Security groups may block ping, so estimate latency by timing the mount or file operations instead. Latency should be under 1 ms for same-zone access

  4. Monitor File Store metrics:

    • Check if the File Store is under heavy load from other clients
    • Consider distributing load across multiple File Stores
  5. Test throughput:

    # Test write speed (writes a 1 GiB test file)
    dd if=/dev/zero of=/mnt/filestore/test-write bs=1M count=1024 oflag=direct
    
    # Test read speed
    dd if=/mnt/filestore/test-write of=/dev/null bs=1M count=1024 iflag=direct
    
    # Remove the test file when done
    rm /mnt/filestore/test-write
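
To compare a measured transfer against the 100 MB/s figure mentioned above, the arithmetic can be scripted. This is a small sketch with hypothetical function names; it uses decimal megabytes (1 MB = 1,000,000 bytes), the same unit dd uses in its summary line.

```shell
# throughput_mbps BYTES SECONDS
# Prints the transfer rate in MB/s with one decimal place.
throughput_mbps() {
  awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / 1000000 / s }'
}

# below_expected RATE [MINIMUM]
# Succeeds (exit 0) when RATE is lower than MINIMUM (default 100 MB/s).
below_expected() {
  awk -v r="$1" -v min="${2:-100}" 'BEGIN { exit !(r + 0 < min + 0) }'
}
```

A 1 GiB test file is 1073741824 bytes, so a run that takes 8 seconds gives throughput_mbps 1073741824 8, which prints 134.2.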
    

Connection drops during heavy load

If connections drop during heavy I/O:

  1. Check if using soft mount (may cause data loss):

    mount | grep filestore
    

    Switch to hard mount:

    sudo mount -t nfs4 -o nfsvers=4.1,_netdev,hard,intr ...
    
  2. Increase NFS timeout values (timeo is in tenths of a second, so timeo=600 is 60 seconds):

    sudo mount -t nfs4 -o nfsvers=4.1,_netdev,hard,timeo=600,retrans=3 ...
    
  3. Reduce concurrent I/O operations
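
Instead of reading the mount | grep filestore output by eye, the active options can be checked in a script. This is a sketch with a hypothetical helper, mount_option_value, that takes a comma-separated option string such as the fourth field of a /proc/mounts entry.

```shell
# mount_option_value OPTIONS KEY
# Prints the value of KEY (or KEY itself for a bare flag such as "hard").
# Returns 1 if KEY is not present in the comma-separated OPTIONS string.
mount_option_value() {
  local opts="$1" key="$2" opt
  local IFS=,
  for opt in $opts; do
    case "$opt" in
      "$key"=*) echo "${opt#*=}"; return 0 ;;
      "$key")   echo "$key"; return 0 ;;
    esac
  done
  return 1
}
```

For example, read the options with opts="$(awk -v m=/mnt/filestore '$2 == m { print $4 }' /proc/mounts)", then test for a soft mount with mount_option_value "$opts" soft, or inspect rsize, wsize, timeo, and retrans the same way.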

Limit issues

"Connection refused" or "Too many open connections"

If you hit connection limits:

  1. Check current connection count via metrics
  2. Reduce concurrent mounts if possible
  3. Review the nconnect mount option: it sets how many TCP connections each mount opens, so a lower value reduces the number of connections per client
  4. Contact support to discuss increasing connection limits

Resolving slow directory listings

If directory operations are slow:

  1. Check directory file count: ls -1 | wc -l
  2. If >10,000 files, consider reorganizing into subdirectories
  3. Review metrics for cache hit rates
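
The entry count check can be scripted so that it also flags directories above the 10,000-file guideline. This is a sketch with a hypothetical function name; it uses find rather than ls so the listing is not sorted first, and it miscounts names that contain newlines.

```shell
# dir_too_large DIR [LIMIT]
# Prints the number of entries directly inside DIR and succeeds (exit 0)
# when that number is greater than LIMIT (default 10000).
dir_too_large() {
  local dir="$1" limit="${2:-10000}" n
  n=$(( $(find "$dir" -mindepth 1 -maxdepth 1 | wc -l) ))
  echo "$n entries in $dir"
  [ "$n" -gt "$limit" ]
}
```

For example, dir_too_large /mnt/filestore/uploads && echo "consider subdirectories", where the uploads path is only a placeholder.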

Resolving filename length errors

If you encounter filename length errors:

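  1. Check filename length in bytes: printf '%s' "filename" | wc -c (echo would add a trailing newline and report one byte too many)
  2. Ensure total path length stays under 4,096 bytes
  3. Remember: the maximum filename length is 255 bytes, not 255 characters. With multi-byte encodings such as UTF-8, fewer than 255 characters may fit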
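
Both limits above can be checked in one pass. The following is a minimal sketch with hypothetical function names; it measures lengths in bytes, so multi-byte characters count for more than one.

```shell
# byte_len STRING: prints the length of STRING in bytes.
byte_len() {
  echo $(( $(printf '%s' "$1" | wc -c) ))
}

# check_path_limits PATH
# Returns 1 if the final path component exceeds 255 bytes or the whole
# path is not under 4096 bytes; returns 0 otherwise.
check_path_limits() {
  local path="$1" name="${1##*/}" failed=0
  if [ "$(byte_len "$name")" -gt 255 ]; then
    echo "filename is $(byte_len "$name") bytes (limit 255)" >&2
    failed=1
  fi
  if [ "$(byte_len "$path")" -ge 4096 ]; then
    echo "path is $(byte_len "$path") bytes (must stay under 4096)" >&2
    failed=1
  fi
  return "$failed"
}
```

For example, byte_len "é" prints 2 on a UTF-8 system even though the name is a single character.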

Getting help

If you've tried the solutions above and still have issues:

  1. Gather diagnostic information:

    # File Store status
    evroc storage filestore get my-filestore -o yaml
    
    # Mount information
    mount | grep filestore
    
    # NFS statistics
    nfsstat
    
    # Network connectivity (check if mount is accessible)
    ls -la /mnt/filestore
    
  2. Contact support with:

    • File Store ID
    • VM IDs experiencing issues
    • Error messages
    • Diagnostic output
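
The diagnostic commands from step 1 can be gathered into a single report to attach to a support request. This is a sketch, not an official collection tool: the function names are illustrative, and commands that are missing or fail are recorded rather than stopping the run.

```shell
# capture COMMAND [ARGS...]
# Prints a header, then the command's output, or a note if it is unavailable.
capture() {
  echo "## $*"
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(exit status $?)"
  else
    echo "(command not found: $1)"
  fi
  echo
}

# collect_diagnostics FILESTORE_NAME [MOUNT_POINT]
collect_diagnostics() {
  local name="$1" mnt="${2:-/mnt/filestore}"
  capture evroc storage filestore get "$name" -o yaml
  capture sh -c 'mount | grep filestore'
  capture nfsstat
  capture ls -la "$mnt"
}
```

For example, collect_diagnostics my-filestore > filestore-diagnostics.txt writes everything to one file; the output file name is only a suggestion.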

See Also