How to split a txt file into files of the same size

    • #102206

      Hi all,

      We have a requirement to split a 1 GB text file into multiple files of 300 MB each:

      1. 1st file: 300 MB
      2. 2nd file: 300 MB
      3. 3rd file: 300 MB
      4. 4th file: 100 MB

      Below is the script we used to split the file into 300 MB pieces. The script does split the file, but some of the output files come out at 250 MB and others at 300 MB.

      Can anyone help us split the file so that the first three files are all the same size?

      Script:

      # Split test
      $sw = New-Object System.Diagnostics.Stopwatch
      $sw.Start()
      $filename = "E:\So\bkp.txt"
      $rootName = "E:\Ta\"
      $ext = ".txt"

      $linesPerFile = 300000   # 300k lines per output file
      $filecount = 1
      $reader = $null
      $writer = $null
      try {
          $reader = [IO.File]::OpenText($filename)
          try {
              "Creating file number $filecount"
              # $ext already starts with a dot, so the format string must not add a second one.
              $writer = [IO.File]::CreateText("{0}{1}{2}" -f ($rootName, $filecount.ToString("000"), $ext))
              $filecount++
              $linecount = 0

              while ($reader.EndOfStream -ne $true) {
                  "Reading $linesPerFile"
                  # Copy lines until this output file holds $linesPerFile lines or the input ends.
                  while (($linecount -lt $linesPerFile) -and ($reader.EndOfStream -ne $true)) {
                      $writer.WriteLine($reader.ReadLine())
                      $linecount++
                  }

                  if ($reader.EndOfStream -ne $true) {
                      "Closing file"
                      $writer.Dispose()

                      "Creating file number $filecount"
                      $writer = [IO.File]::CreateText("{0}{1}{2}" -f ($rootName, $filecount.ToString("000"), $ext))
                      $filecount++
                      $linecount = 0
                  }
              }
          } finally {
              if ($null -ne $writer) { $writer.Dispose() }
          }
      } finally {
          if ($null -ne $reader) { $reader.Dispose() }
      }
      $sw.Stop()

      Write-Host "Split complete in" $sw.Elapsed.TotalSeconds "seconds"

      Thanks in advance.

    • #102433

      Since you want to split the file by size, the code should count the bytes copied rather than the lines copied: lines vary in length, so a fixed number of lines per file produces files of different sizes. Here is an example. Save it to a file (e.g. Out-FileChunks.ps1) and then run:

      . .\Out-FileChunks.ps1 -Path "SourceFile" -OutputPath "C:\SomeOutputFolder" -ChunkSizeBytes 300MB

      [CmdletBinding()]
      PARAM 
      (
          [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
          [string] $Path,
      
          [Parameter(Mandatory=$true)]
          [string] $OutputPath,
      
          [Parameter()]
          [int] $ChunkSizeBytes = 300MB
      )
      
      $bufferSize = 24 * 1024;
      $buffer = [System.Byte[]]::CreateInstance([System.Byte], $bufferSize)
      
      # Create the output folder if it doesn't exist.
      if ( -not (Test-Path $OutputPath))
      {
          $null = New-Item -Path $OutputPath -ItemType Directory -Force
      }
      
      $fileExtension = [System.IO.Path]::GetExtension($Path)
      $fileNameRoot = $Path | Split-Path -Leaf
      $outputFileNameRoot = [System.IO.Path]::GetFileNameWithoutExtension($fileNameRoot)
      
      try
      {
          $inputStream = [System.IO.File]::OpenRead($Path)
          $fileCount = 0;
      
          # Loop through the entire file.
          while ($inputStream.Position -lt $inputStream.Length)
          {
              $outputFileName = [string]::Format("{0}{1}{2}", $outputFileNameRoot, $fileCount.ToString("000"), $fileExtension)
              $outputFilePath = Join-Path $OutputPath $outputFileName
      
              Write-Progress "Writing file chunk $outputFilePath"
              # Create each output file and fill it with up to $ChunkSizeBytes bytes.
              $outputStream = [System.IO.File]::Create($outputFilePath)
                  
              $chunkBytesRemaining = $ChunkSizeBytes
      
              while ($chunkBytesRemaining -gt 0)
              {
                  $bytesRead = $inputStream.Read($buffer, 0, [System.Math]::Min($chunkBytesRemaining, $bufferSize))
      
                  if ( $bytesRead -le 0 )
                  {
                      # nothing left to read so done writing all chunks.
                      break;
                  }
      
                  $outputStream.Write($buffer, 0, $bytesRead);
                  $chunkBytesRemaining -= $bytesRead;
              }
                  
              $outputStream.Dispose();
              $outputStream = $null;
      
              Write-Progress "Completed writing file chunk $outputFilePath"
              $fileCount++;
          }
      }
      finally
      {
          if ( $inputStream -ne $null )
          {
              $inputStream.Dispose();
              $inputStream = $null;
          }
      
          if ( $outputStream -ne $null )
          {
              $outputStream.Dispose();
              $outputStream = $null;
          }
      }
      
    • #102640

      You can split a text file into multiple smaller text files from the command line:

      split -l 5000 -d --additional-suffix=.txt $FileName file

      -l 5000: split the file into files of 5,000 lines each.
      -d: use numeric suffixes, so the suffixes run from 00 to 99 by default instead of aa to zz.
      --additional-suffix: lets you specify an extra suffix, here the file extension.
      $FileName: name of the file to be split.
      file: prefix for the names of the resulting files.

      As always, check out man split for more details.

      On macOS, the default version of split is apparently more limited. You can install the GNU version with the following command:

      brew install coreutils

      You can then run the command above with split replaced by gsplit. Check out man gsplit for details.
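
Note that -l splits by line count, so the output sizes still depend on line length, which is the problem described in the original question. GNU split can also split by size. The following is only a minimal sketch, assuming GNU coreutils split (gsplit on macOS). The temporary directory, the generated sample file and the bytes_ and lines_ prefixes are made up for illustration, and the 30000-byte chunk size stands in for 300M so the demo stays small.

```shell
#!/bin/sh
# Sketch: split by size instead of by line count (GNU split assumed).
set -eu

# Work in a throwaway directory (hypothetical location, for the demo only).
workdir=$(mktemp -d)
cd "$workdir"

# Generate a sample input: 1000 lines of exactly 100 bytes each (100,000 bytes).
i=0
while [ "$i" -lt 1000 ]; do
    printf '%099d\n' "$i"
    i=$((i + 1))
done > bkp.txt

# -b SIZE: every output file is exactly SIZE bytes, except possibly the last.
# A line can be cut in two at a chunk boundary.
split -b 30000 -d --additional-suffix=.txt bkp.txt bytes_

# -C SIZE: every output file is at most SIZE bytes and contains only whole lines.
split -C 30000 -d --additional-suffix=.txt bkp.txt lines_

# Show the size of each resulting file.
wc -c bytes_*.txt lines_*.txt
```

With this sample input both variants produce three files of 30,000 bytes and a fourth of 10,000 bytes. For the 1 GB file in the question, replacing 30000 with 300M would be the equivalent call. Choose -b when exact sizes matter and -C when no line may be broken across two files.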
