I am pulling a string from a text file that looks like:

C:\Users\users\Documents\Firefox\tools\Install.ps1:37:    Url = "https://somewebsite.com"

I need to somehow remove everything except the URL, so it should look like:

https://somewebsite.com

Here is what I have tried:

$Urlselect = Select-String -Path "$zipPath\tools\chocolateyInstall.ps1"  -Pattern "url","Url"-List # Selects URL download path
$Urlselect = $Urlselect -replace ".*" ","" -replace ""*.","" # remove everything but the download link

but this didn't seem to do anything. I think it's going to involve a regex, but I am not sure how to write it. Any help is appreciated. Thanks

mklement0

1 Answer

I suggest using the switch statement with the -Regex and -File options:

$url = switch -regex -file "$zipPath\tools\chocolateyInstall.ps1" { 
  ' Url = "(.*?)"' { $Matches[1]; break } 
}
  • -file makes switch loop over all lines of the specified file.
  • -regex interprets the branch conditionals as regular expressions, and the automatic $Matches variable can be used in the associated script block ({ ... }) to access the results of the match, notably, what the 1st (and only) capture group in the regex ((...)) captured - the URL of interest.
  • break stops processing once the 1st match is found. (To continue matching, use continue).
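
To try the switch approach without the real package, here is a minimal sketch. The temp-file name and the sample lines (including the Url64 entry) are made up for illustration and are not from the original script:

```powershell
# Hypothetical sample file standing in for chocolateyInstall.ps1.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'switch-demo-install.ps1'
Set-Content -LiteralPath $sample -Value @(
  '$packageArgs = @{'
  '    Url = "https://somewebsite.com"'
  '    Url64 = "https://somewebsite.com/x64"'
  '}'
)

# -File reads the file line by line; -Regex treats each branch condition as a regex.
$url = switch -Regex -File $sample {
  ' Url = "(.*?)"' { $Matches[1]; break }   # capture group 1 holds the bare URL
}

$url   # https://somewebsite.com

# Best-effort cleanup of the demo file.
Remove-Item -LiteralPath $sample -ErrorAction Ignore
```

The Url64 line is skipped because the regex requires the literal text ' Url = ', and break would stop the loop after the first hit anyway.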

If you do want to use Select-String:

$url = Select-String -List ' Url = "(.*?)"' "$zipPath\tools\chocolateyInstall.ps1" |
  ForEach-Object { $_.Matches.Groups[1].Value }

Note that the switch solution will perform much better.


As for what you tried:

Select-String -Path "$zipPath\tools\chocolateyInstall.ps1" -Pattern "url","Url"

Select-String is case-insensitive by default, so there's no need to specify case variations of the same string. (Conversely, you must use the -CaseSensitive switch to force case-sensitive matching).

Also note that Select-String doesn't output the matching line directly, as a string, but as a match-information object; to get the matching line, access the .Line property[1].
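
Both points can be checked with a small sketch; the temp file and its single line are assumptions made for this demo:

```powershell
# Hypothetical one-line sample file.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'select-string-demo.ps1'
Set-Content -LiteralPath $sample -Value '    Url = "https://somewebsite.com"'

# Lowercase 'url' still matches 'Url' because matching is case-insensitive by default.
$match = Select-String -LiteralPath $sample -Pattern 'url' -List

$match.GetType().Name   # MatchInfo, not String
$match.Line             # only the text of the matching line
"$match"                # stringified form: path and line number, then the line

# With -CaseSensitive, the lowercase pattern no longer matches.
$none = Select-String -LiteralPath $sample -Pattern 'url' -CaseSensitive

Remove-Item -LiteralPath $sample -ErrorAction Ignore
```

The stringified form is what produces the path:37: prefix shown in the question, which is why applying -replace directly to the Select-String output operates on more than just the line.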

$Urlselect -replace ".*" ","" -replace ""*.",""

".*" " and ""*." result in syntax errors, because you forgot to escape the embedded " as `".

Alternatively, use '...' (single-quoted literal strings), which allows you to embed " as-is and is generally preferable for regexes and replacement operands, because there's no confusion over what parts PowerShell may interpret up front (string expansion).
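
A short sketch of the difference between the two quoting styles, using made-up input strings:

```powershell
$line = 'Url = "https://somewebsite.com"'

# Double-quoted regex: the embedded " must be escaped as `"
$viaDouble = $line -replace "`"", ''

# Single-quoted regex: the " can be embedded as-is
$viaSingle = $line -replace '"', ''

$viaDouble   # Url = https://somewebsite.com
$viaSingle   # Url = https://somewebsite.com

# Up-front expansion: in a double-quoted string, $1 is treated as a PowerShell
# variable (undefined here, so it expands to nothing) before -replace sees it.
$expanded = 'ab' -replace '(a)', "[$1]"   # []b
$literal  = 'ab' -replace '(a)', '[$1]'   # [a]b
```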

Even with the escaping problem solved, however, your -replace operations wouldn't have worked, because .*" matches greedily and therefore up to the last "; here's a corrected solution with non-greedy matching, and with the replacement operand omitted (which makes it default to the empty string):

PS> 'C:\...ps1:37: Url = "https://somewebsite.com"' -replace '^.*?"' -replace '"$'
https://somewebsite.com
  • ^.*?" non-greedily replaces everything up to the first ".
  • "$ replaces a " at the end of the string.
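
To see the effect of greediness in isolation, the following sketch applies each regex to the sample line used above:

```powershell
$line = 'C:\...ps1:37: Url = "https://somewebsite.com"'

# Greedy: .* runs through the LAST ", so the whole line is removed.
$greedy = $line -replace '^.*"'

# Non-greedy: .*? stops at the FIRST ", leaving the URL plus the closing quote.
$lazy = $line -replace '^.*?"'

# Chaining a second -replace strips the trailing quote.
$url = $line -replace '^.*?"' -replace '"$'

$greedy   # (empty string)
$lazy     # https://somewebsite.com"
$url      # https://somewebsite.com
```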

However, you can do it with a single -replace operation, using the same capture-group technique as in the switch solution at the top:

PS> 'C:\...ps1:37: Url = "https://somewebsite.com"' -replace '^.*?"(.*?)"', '$1'
https://somewebsite.com

$1 in the replacement operand refers to what the 1st capture group ((...)) captured, i.e. the bare URL; for more information, see this answer.
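
One thing to keep in mind is that -replace returns its input unchanged when the regex doesn't match, so a line without quotes passes through as-is. If that matters, a -match test can tell the two cases apart. The function name below is hypothetical and is only a sketch of that idea:

```powershell
# Hypothetical helper: returns the first double-quoted value, or nothing if there is none.
function Get-FirstQuotedValue {
  param([string] $Line)
  if ($Line -match '^.*?"(.*?)"') { $Matches[1] }
}

$withUrl    = 'C:\...ps1:37: Url = "https://somewebsite.com"'
$withoutUrl = 'no quotes here'

$a = $withUrl    -replace '^.*?"(.*?)"', '$1'   # https://somewebsite.com
$b = $withoutUrl -replace '^.*?"(.*?)"', '$1'   # no quotes here (input returned unchanged)

$c = Get-FirstQuotedValue $withUrl              # https://somewebsite.com
$d = Get-FirstQuotedValue $withoutUrl           # nothing ($null)
```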


[1] Note that there's a green-lit feature suggestion - not yet implemented as of PowerShell Core 6.2.0 - to allow Select-String to emit strings directly, using the proposed -Raw switch - see https://github.com/PowerShell/PowerShell/issues/7713 (the -Raw switch has since been added in PowerShell 7.0).

mklement0
  • I'm glad to hear it was helpful, @revgirl2012; my pleasure. Good luck; I hope you'll learn to enjoy PowerShell; it may take a while, but it's worth it. – mklement0 Feb 28 '19 at 15:50