Removing Quotes from all CSV Files in a Directory

Managing CSV files often involves cleaning up data before processing. A common task is removing unwanted quotation marks from the contents of CSV files.

I wrote a blog post on how to remove quotes from a single CSV file using PowerShell some years ago, which I have just updated. At the end, I had the idea of adding a follow-up which is to do the same, remove quotes in a CSV file, but for multiple files.

Ideally you should be cleaning the data at source rather than performing this task. There’s many ways to do this too, I think you should use the one you find the easiest that saves most of your time.

PowerShell Script to Remove Quotes from Multiple CSV Files

Here is a script to remove quotes from all CSV files in a specified directory and save the cleaned versions in a new output directory:

# Define the input directory and output directory
$InputDirectory = "C:\Path\To\Input\CSVFiles"
$OutputDirectory = "C:\Path\To\Output\CleanedCSVFiles"

# Create the output directory if it doesn't exist
if (-not (Test-Path $OutputDirectory)) {
    New-Item -ItemType Directory -Path $OutputDirectory
}

# Get all CSV files in the input directory
$CSVFiles = Get-ChildItem -Path $InputDirectory -Filter "*.csv"

foreach ($File in $CSVFiles) {
    # Read the content of the current CSV file, remove quotes, and save it to a new file
    $InputFilePath = $File.FullName
    $OutputFilePath = Join-Path -Path $OutputDirectory -ChildPath $File.Name

    # Process the file: Remove quotes and output to the new file
    Import-Csv -Path $InputFilePath |
    ConvertTo-Csv -NoTypeInformation |
    ForEach-Object { $_ -Replace '"', "" } |
    Out-File -FilePath $OutputFilePath -Force -Encoding ascii

    Write-Host "Processed file: $InputFilePath -> $OutputFilePath"
}

Write-Host "All CSV files have been processed and saved in $OutputDirectory."

How it works:
Define Directories:
The script processes files from the InputDirectory and saves cleaned versions in the OutputDirectory. Ensure you update these paths with your actual directories.
Batch Processing: Uses Get-ChildItem to loop through all .csv files in the input directory.
Remove Quotes: Reads each file’s content, removes quotes, and writes the cleaned data to a new file.
Output Directory: Files are saved with their original names in the designated output directory to avoid overwriting the source files.

Recommendations:
Monitor Performance: If you’re working with large CSV files, monitor your system resources during processing.
Backup Original Files: Always create backups of the original files before running bulk operations.
Validate Data: Check the output files to ensure the data integrity is maintained after quote removal.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

Recent Posts
Categories
Tags

Always On Availability Groups (AAG) (4) AWS (4) AWS Redshift (6) Certificates & Encryption (3) Change Data Capture (CDC) (2) Database Admin (72) Database Backups & Recovery (14) Database Mirroring (2) Error Messages (5) Git Commands (6) Importing & Exporting Data (2) Linked Servers (3) Linux Admin (2) Logging & Monitoring (1) Measuring Databases (10) Microsoft Patching (2) MySQL (4) Postgres (6) PowerShell Scripts (1) SQL Server Agent (5) SQL Server Database Files (1) SQL Server Data Types (2) SQL Server Management Studio (SSMS) (16) SQL Server Network Connectivity (3) SQL Server on Linux (1) SQL Server Patching (2) SQL Server Performance Tuning (6) SQL Server Processes (SPIDs) (7) SQL Server Replication (2) SQL Server Scripts (13) SQL Server Security (3) Windows Admin (21) Windows Authentication (2) Windows Automation (1) Windows Events (2) Windows Firewall (4) Windows Subsystem for Linux (WSL) (18)