Read in Array elements, Compare to Word Document, Add & Count to CSV File

Welcome Forums General PowerShell Q&A Read in Array elements, Compare to Word Document, Add & Count to CSV File

This topic contains 14 replies, has 2 voices, and was last updated by

 
Participant
2 weeks, 3 days ago.

  • Author
    Posts
  • #172270

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    I need help modifying this script.

    It does a lot and maybe it can be re-written to be more efficient.

    Currently, this script has an array of doctor names.  When a new doctor comes on board, the name is added to this array.

    The script looks in a folder on the server and reads in word documents.

    The script looks for the line that starts with "Dictated by:" and then it finds and compares the doctor name on the report and the array.  If found, it counts the occurrence and adds to a csv file.

    Then the script emails the csv as an attachment.

    There is one doctor name that is non-standard and coming over in an unknown format:  YAP, FELIX Y.

    I want the script to accept this name and continue.  I want the script to count it on the csv.  Also how can I get the outputted csv file to total each column and display the column totals?  I'm scratching my head on this.

    Help!  Thanks.

     

  • #172297

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Hello Gina,

    Can you give an example of the non-standard format issue you are facing with Yap, Felix Y?

  • #172996

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    Hello.  In my doctor name array, I have John X. Smith, etc.  There are two names that do not come over in this format.  We have YAP, FELIX Y.   and we have *Mark Ziemba.   I have a "exceptions" hash or a "synomym hash table but for some reason the script is still not recognizing YAP, FELIX Y.

    # hash table of alternative names in format @{'alias' = 'standard name', 'alias' = 'standard name'}

    $docSynonyms = @{'YAP, FELIX Y.' = 'Felix Y. Yap'
    '*Mark Ziemba' = 'Mark Ziemba'}

    How do I get the script to work with these exceptions?  How do I get it so that if I have other exceptions, it will see this hash table and carry on successfully with what I set in the hash table?

    Also how can I get it to add up columns and put a total at the bottom of each column?  Thank you!

     

  • #173017

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Hello Gina,

    I'm analyzing the code in further, you stated you are using Synonyms for those doctors name. At what point in the code are you pulling the doctor's name that needs to be converted?

    I'm trying to understand why this was done?

    $Match = Select-String -Path $document -Pattern '^*Dictated\s+by\s*:?\s*(.*?)\s*\Z' | Select-Object -Last 1 -ExpandProperty Matches

    Why not just use the following?

    Select-String -Pattern "Dictabed by"
  • #173389

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    Hi Jason.  There is now a portion of code that sits beneath the doclist array. Also, the name of the doctor varies and it follows the "Dictated by:" text.  So on the bottom of every report you have this line of text in this standard format:
    Dictated by: Timothy L. Davis, M.D. on 8/28/2019 at 7:11

    There are two cases where I know of right now that the doctor name is NOT in standard format:
    Dictated by: YAP, FELIX Y., M.D. on x/xx/xxxx at hh:mm
    AND
    Dictated by: *Mark Ziemba on x/xx/xx at hh.mm

     

     # hash table of alternative names in format @{'alias' = 'standard name', 'alias' = 'standard name'}
    $docSynonyms = @{'YAP, FELIX Y.' = 'Felix Y. Yap'}
    
    $doc = @{}
    
    try {
    # predefine doctor structure
    foreach ($name in $docList)
    {
    $doc[$name] = @{}
    foreach ($site in $sites) { $doc[$name] += @{ $site = @() } }
    }
    
    # process documents based on site
    foreach ($site in $sites)
    {
    Get-ChildItem ($(if ($site -eq 'rdc') { "D:\reports\dr_reports" } else { "D:\ininbound\$site" }) + "\archive\$date") |
    % {
    $document = $_.FullName
    if ($Match = Select-String -Path $document -Pattern '^*Dictated\s+by\`*:\s*\*?(.*?),\s' | Select-Object -Last 1 -ExpandProperty Matches)
    {
    $content = $Match.Groups[1].Value
    Write-Host "`r`n"
    Write-Host $document -ForegroundColor Cyan
    Write-Host $content
    
    if ($content -in $docSynonyms) { $content = $docSynonyms[$content] }
    if ($doc[$content])
    {
    $doc[$content].$site += $document
    Write-Host "$document matched $content" -ForegroundColor Yellow
    Add-Content -path $doctorLog -value "$content found on $document"
    } else {
    Write-Host -f red "$content not in list of doctors"
    }
    } else {
    # String not found

     

  • #173398

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Gina,

    I'm under the impression the $DocList is was originally the $Doctors array you created? Just want to get some verification.

  • #173413

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Hey Gina,

    Okay so after parsing your code and revamping some of it to make more sense to me, I've found what is happening.  The regular expression you have is stopping at the first comma found. Below is an example of the output. This is what you need to fix to finish pulling all of the data. All I did was create 2 files with varying data you had created.

    PS C:\Folder> Get-Content File*.txt
    Dictated by: Timothy L. Davis, M.D. on 8/28/2019 at 7:11
    Dictated by: YAP, FELIX Y., M.D on 8/28/2019 at 7:11
    PS C:\Folder> (Get-ChildItem | Select-String -Pattern '^*Dictated\s+by\`*:\s*\*?(.*?),\s' | Select-Object -ExpandProperty Matches)
    
    Groups : {0, 1}
    Success : True
    Name : 0
    Captures : {0}
    Index : 0
    Length : 31
    Value : Dictated by: Timothy L. Davis,
    
    Groups : {0, 1}
    Success : True
    Name : 0
    Captures : {0}
    Index : 0
    Length : 18
    Value : Dictated by: YAP,
  • #173452

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Gina,

    Great news, spoke with a colleague who uses Regex more than me and we worked it out. Try this Regex now.

    '^Dictated\s+by*:\s\*?(.*?)*(?=,\sM.D)'

    or

    .+?(?=,\sM.D)

    (Get-childitem .\ | Select-String -Pattern '^Dictated\s+by*:\s\*?(.*?)*(?=,\sM.D)' | Select-object -expand matches).Groups.Value
    Dictated by: Timothy L. Davis
    
    Dictated by: YAP, FELIX Y.
    PS C:\Folder> (Get-childitem .\ | Select-String -Pattern '.+?(?=,\sM.D)' | Select-object -expand matches).Groups.Value
    Dictated by: Timothy L. Davis
    Dictated by: YAP, FELIX Y.
  • #173584

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    Awesome.  Thank you!  Sorry for the delayed response.  Big projects in the mix.  I will play with this code and let ya know what happens!  Thank you so much.  This is great. 🙂

  • #173827

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    I must be typing in something wrong.  This isn't working for me.  I get an error.  Here's the code I'm trying and I'll try and error below too.

    
    foreach ($site in $sites)
    {
    Get-ChildItem ($(if ($site -eq 'rdc') { "D:\reports\dr_reports" } else { "D:\ininbound\$site" }) + "\archive\$date") |
    % {
    $document = $_.FullName
    if ($Match = Select-String -Path $document -Pattern '^Dictated\s+by*:\s\*?(.*?)*(?=,\sM.D)' | Select-Object -Last 1 -ExpandProperty Matches)
    {
    $content = $Match.Groups[1].Value
    
    

    ERROR I receive:

    D:\reports\dr_reports\archive\2019-08-29\1102211.doc
    not in list of doctors
    D:\reports\dr_reports\archive\2019-08-29\20195LD9VUVY9.doc
    not in list of doctors

    Get-Content : Cannot find path 'D:\script\Isaacs-Scripts\scripts\get-doctorcount\logs\08-29-2019.02.25.txt' because it does not exist.
    At D:\script\Isaacs-Scripts\scripts\get-doctorcount\TESTING-082919.ps1:113 char:22
    + $sortedContent = Get-Content $doctorLog | Sort-Object
    + ~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : ObjectNotFound: (D:\script\Isaac...-2019.02.25.txt:String) [Get-Content], ItemNotFoundException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand

     

  • #173833

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Gina,

    The issue is because the item in question doesn't exist. Its not an issue with the Regex.

  • #174307

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    Good morning. Thanks for helping with the names and REGEX. I'm going to let that go for now because it's more complex than I thought. I'm not sure how to make the script accept a name that isn't in standard format and add it to the running total. It's a complicated task.
    Lastly though, is there any way to total each column? Right now the script does a grand total on the far left. I can't figure out how to get each column to total.

  • #174313

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Hello Gina,

    Please provide a copy of the updated script for review. With so many pieces having been worked on I'm not sure what you are using and not at the moment. If you are adding each of the values into its own array you could do a count on the property. I.e.

    $Object.Doctorname.Count
  • #174349

    Participant
    Topics: 7
    Replies: 17
    Points: 40
    Rank: Member

    Yes, I agree.  Lots of tweeking going on.  Here's the latest.

    
    $sites = 'cdc','fh','mrmc','svh','scdc','rdc'
    
    $date = (Get-Date(Get-Date).AddDays(-1) -Format 'yyyy-MM-dd')
    $dateTime = (Get-Date(Get-Date).AddDays(-1) -Format 'MM-dd-yyyy.hh.mm')
    
    $csvPath = "D:\script\Isaacs-Scripts\scripts\get-doctorcount\CSVs\$dateTime.csv"
    $errorLog = "D:\script\Isaacs-Scripts\scripts\get-doctorcount\errorlogs\$dateTime.html"
    $errordoc = "D:\script\Isaacs-Scripts\scripts\get-doctorcount\errorlogs\$dateTime-doctor-error.txt"
    $doctorLog = "D:\script\Isaacs-Scripts\scripts\get-doctorcount\logs\$dateTime.txt"
    
    # array of all doctor names
    # to add to it, simply add quotes and a comma (unless its the last in the array)
    $docList = @(
    'Benjamin T. Addicott',
    'Timothy L. Auran',
    'Eugene F. Barasch',
    'William C. Burnette',
    'Kevin R. Connolly',
    'Harry F. Corbett',
    'Timothy L. Davis',
    'Arthur C. Duberg',
    'Stephen R. Holtzman',
    'Taylor R. Jordan',
    'Thomas L. Miller',
    'Linda D. Mulder',
    'Ragu C. Nathan',
    'Erik M. Olson',
    'Jaywant P. Parmar',
    'Mohsin M. Rahman',
    'David A. Rigual',
    'William M. Russell',
    'Farhad W. Sani',
    'Melissa E. Sims',
    'Laura E. Traube',
    'David J. Tuttle',
    'Erica Vergara',
    'Donna E. Winingham',
    'Felix Y. Yap',
    'Mark Ziemba',
    "Teresa O'Neill",
    "Sonja M. O'Brien"
    )
    
    # hash table of alternative names in format @{'alias' = 'standard name', 'alias' = 'standard name'}
    # Would like some way of capturing the KNOWN maflormed names so that the script continues.  Take that malformed name and add it to the running total for the doctor and for the site.
    $docSynonyms = @{'YAP, FELIX Y.' = 'Felix Y. Yap'}
    
    $doc = @{}
    
    try {
    # predefine doctor structure
    foreach ($name in $docList)
    {
    $doc[$name] = @{}
    foreach ($site in $sites) { $doc[$name] += @{ $site = @() } }
    }
    
    # process documents based on site
    foreach ($site in $sites)
    {
    Get-ChildItem ($(if ($site -eq 'rdc') { "D:\reports\dr_reports" } else { "D:\ininbound\$site" }) + "\archive\$date") |
    % {
    $document = $_.FullName
    if ($Match = Select-String -Path $document -Pattern '^*Dictated\s+by\`*:\s*\*?(.*?),\s' | Select-Object -Last 1 -ExpandProperty Matches)
    
    {
    $content = $Match.Groups[1].Value
    Write-Host "`r`n"
    Write-Host $document -ForegroundColor Cyan
    Write-Host $content
    
    if ($content -in $docSynonyms) { $content = $docSynonyms[$content] }
    if ($doc[$content])
    {
    $doc[$content].$site += $document
    Write-Host "$document matched $content" -ForegroundColor Yellow
    Add-Content -path $doctorLog -value "$content found on $document"
    } else {
    Write-Host -f red "$content not in list of doctors"
    Add-Content $errordoc -Value "$content not in the list of doctors"
    }
    } else {
    # String not found
    }
    }
    }
    
    Set-Content $csvPath -Value $dateTime
    Add-Content $csvPath -value "DRName,TOTAL,RASLO,AGCH,FH,MRMC,SVH,SCDC"
    
    $completeTotal = 0
    foreach ($doctor in $docList)
    {
    $curdoc = $Doc[$doctor]
    $total = ($curdoc[$sites] | % { $_ }).Count
    $completeTotal += $total
    $firstname = $doctor.Split(' ')
    $firstname = $firstname[0] + $firstname[-1][0]
    
    Add-Content $csvPath -Value "$firstname,$total,$($curdoc.rdc.count),$($curdoc.cdc.count),$($curdoc.fh.count),$($curdoc.mrmc.count),$($curdoc.svh.count),$($curdoc.scdc.count)"
    }
    
    Add-Content $csvPath -Value "Total Doctor Count,$completeTotal"
    Send-MailMessage -From 'email' -to 'email' -Subject "Radiologist Daily Read Count for $date" -Body "Radiologist counts for $date" -SmtpServer 'aspmx.l.google.com' -port '25' -Attachments $csvPath
    Send-MailMessage -From 'email' -to 'email' -Subject "Error – NON Rads for $date" -Body "Error – Rads not in rad array for $date" -SmtpServer 'aspmx.l.google.com' -port '25' -Attachments $errordoc
    
    $sortedContent = Get-Content $doctorLog | Sort-Object
    Set-Content -path $doctorLog -Value $sortedContent
    
    } catch {
    Add-Content $errorlog -Value "Error Content -

    Exception: $($_.Exception)

    CategoryInfo: $($_.CategoryInfo)

    StackLocation: $($_.ScriptStackTrace)
    " Send-MailMessage -From 'email' -to 'email' -Subject 'Error found with Doctor Count:' -Body "Error Content -

    Exception: $($_.Exception)

    CategoryInfo: $($_.CategoryInfo)

    StackLocation: $($_.ScriptStackTrace)
    " -SmtpServer 'aspmx.l.google.com' -port '25' -BodyAsHtml }

     

  • #174409

    Participant
    Topics: 0
    Replies: 100
    Points: 363
    Helping Hand
    Rank: Contributor

    Gina,

    I've updated your code, let me know if this still works. Once we verify this we can start on the next piece. I highly recommend you stop using one liners and aliases in your script. This is not best practice when creating a script.

You must be logged in to reply to this topic.