I have a file that contains a URL. I'm trying to get the URL from that file using a shell script.
In the file, the URL is like this:
('URL', 'http://url.com');
I tried to use the following:
cat file.php | grep 'URL' | awk '{ print $2 }'
It gives the output as:
'http://url.com');
But I need to get only url.com in a variable inside the shell script. How can I accomplish this?
You can do everything with a simple grep:
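The command itself is missing from this copy, but from the description below it would have been something like this (file name taken from the question):

grep -oP "http://\K[^']+" file.php

To capture the result in a shell variable:

url=$(grep -oP "http://\K[^']+" file.php)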
From man grep, -o (--only-matching) prints only the matched parts of matching lines, and -P (--perl-regexp) interprets the pattern as a Perl-compatible regular expression.

The trick is to use \K which, in Perl regex, means discard everything matched to the left of the \K. So, the regular expression looks for strings starting with http:// (which is then discarded because of the \K) followed by as many non-' characters as possible. Combined with -o, this means that only the URL will be printed.

You could also do it in Perl directly:
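A sketch of the Perl one-liner (reconstructed, not the answer's original):

perl -nle 'print $1 if m{http://([^\x27]+)}' file.php

Here \x27 stands for the single quote, which sidesteps quoting problems inside the shell's single quotes.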
Something like this?
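Perhaps a pipeline that splits the line on single quotes (a sketch; the original command was not preserved, and the field number assumes the exact format shown in the question):

grep 'URL' file.php | cut -d"'" -f4

This prints http://url.com.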
or
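Again, a sketch of the missing command, with a sed step appended:

grep 'URL' file.php | cut -d"'" -f4 | sed 's|http://||'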
To strip out http://.
Try this:
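The command was lost here; a sed substitution of this shape would fit (the pattern assumes the question's line format):

sed -n "s|.*'http://\([^']*\)'.*|\1|p" file.php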
Revisiting this again, and trying to use nothing but a Bash shell, another one-line solution is:
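Possibly something like this reconstruction, using only parameter substitution inside the loop:

while read -r line; do t=${line#*//}; echo "${t%%\'*}"; done < file.in > file.out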
Where file.in contains the 'dirty' URL list and file.out will contain the 'clean' URL list. There are no external dependencies and there is no need to spawn any new processes or subshells. The original explanation and a more flexible script follow. There is a good summary of the method here, see example 10-10. This is pattern-based parameter substitution in Bash.
Expanding on the idea:
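A step-by-step sketch of the same substitutions (the variable names here are illustrative, not from the original answer):

src="('URL', 'http://url.com');"
t=${src#*//}        # strip everything through "//"          ->  url.com');
url=${t%%\'*}       # strip from the first remaining quote   ->  url.com
echo "$url"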
Result:
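url.com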
No need to call any external programs. Furthermore, the following bash script, get_urls.sh, permits you to read a file directly or from stdin.

If all the lines contain a URL:
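A sketch of what get_urls.sh might have looked like for that case (it reads the file named as the first argument, or stdin if none is given):

#!/usr/bin/env bash
while read -r line; do
    t=${line#*//}
    echo "${t%%\'*}"
done < "${1:-/dev/stdin}"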
If only some lines contain a URL:
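The same script with a guard (again a sketch; the ^define test assumes the file's lines are PHP define() statements):

#!/usr/bin/env bash
while read -r line; do
    [[ $line =~ ^define ]] || continue   # skip lines without a define(...) statement
    t=${line#*//}
    echo "${t%%\'*}"
done < "${1:-/dev/stdin}"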
Depending on the other lines you may need to change the ^define regex.

Simple:
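Presumably something like:

grep -o "http://[^']*" file.php

which prints http://url.com.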
and if you need to remove the 'http://', then:
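Perhaps:

grep -o "http://[^']*" file.php | sed 's|^http://||'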
So:
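Putting it together into the variable the question asks for (a sketch):

url=$(grep -o "http://[^']*" file.php | sed 's|^http://||')
echo "$url"    # url.com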
If you need a certain part of the URL, you need to refine your terminology: a URL is all of the following, sometimes more:
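For instance (an illustrative breakdown, not the original answer's example):

scheme://user:password@host.example.com:port/path/to/resource?query=string#fragment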
For me, the other grep answers given return string information after the link. This worked for me to pull out only the url:
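A common grep of this shape would do that (a guess at the lost command; the character class is an assumption):

grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" file.php

For the question's input this prints http://url.com and nothing after it.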