Page 1
Page 1
Started By
Message

Help Excel gurus..need to automatically reformat a text string

Posted on 11/13/14 at 3:51 pm
Posted by CAD703X
Liberty Island
Member since Jul 2008
78106 posts
Posted on 11/13/14 at 3:51 pm
Here's the gist:

Column A cells contains HTML text block.
Within that block is a substring that needs to be reformatted and the results dumped into Column B

quote:

Example of text in column A
< html > blah blah ftp://userA:password@staging.domain.com/path/image1.jpg" blah blah < / html >


I want to grab *JUST* the FTP portion (in bold) and return the results in a string in column B that looks like this:

quote:

http://production.domain.com/userA


The tricky thing is I need to pick out that "userA" (this is actually the same name as the filepath) and add it to the end of the HTTP path

Make sense?

The final step (THIS WOULD HAPPEN LATER) would be to swap the BOLD portion of the example in the first quote with the second quote but that will be done via DB query.

The list in Column B will form the input for the query. The result will ultimately look like this:

quote:

http://production.domain.com/userA/path/image1.jpg


I have 10,000 rows to fix, each one has a different "userA" that needs to be pulled out and reinserted at the end.

Anyone good with string substitutions? TIA
This post was edited on 11/13/14 at 3:57 pm
Posted by gamatt53
Member since Nov 2010
4934 posts
Posted on 11/13/14 at 4:19 pm to
You will need to use the find function to find the starting and ending positions of the text strings you want to islolote (for the ending position add the length of the search string to the find result).

Once you know the start and end positions of the text you want to islotate use a mid function referencing those find functions that give you the starting and ending positins.

You can then concantenate the various texts strings back together after you have isloated them in whatever order you need to.

This post was edited on 11/13/14 at 4:20 pm
Posted by CAD703X
Liberty Island
Member since Jul 2008
78106 posts
Posted on 11/13/14 at 4:21 pm to
i've used FIND and MID to find the substring and pull that substring into a separate cell so far.

now i'm trying to figure out how to chop it up..but i think i see what to do...rinse..lather..repeat...until i have what i need.
Posted by gamatt53
Member since Nov 2010
4934 posts
Posted on 11/13/14 at 4:27 pm to
No need to do anything manual - the right formulas with the correct cell references should give you what you need when you copy down the 10,000 rows.

You know you can build a text string via formula with text strings in quotes or cell references seperated by the & symbol right?

For example lets assume cell a1 has the sting "excel" and cell b1 has string "easy".

=a1&"is"&b2 would give you "excel is easy"



Therefore all you need to do is isolate all your text strings with find and mid then build them back together with a function like the above.
Posted by CAD703X
Liberty Island
Member since Jul 2008
78106 posts
Posted on 11/13/14 at 4:37 pm to
i'm close..ive had to use about 4-5 cells though because the formula is getting crazy..its getting there.
Posted by The 25 Jersey
Tiger Stadium
Member since Oct 2007
1345 posts
Posted on 11/13/14 at 4:39 pm to
Try this out:

=CONCATENATE("ht\tp://",MID(A1,FIND("@",A1) + 1,FIND("/",A1,FIND("@",A1))-FIND("@",A1)),MID(A1,FIND("://",A1) + 3,FIND(":",A1,FIND("://",A1)+3)-FIND("://",A1)-3))

edit: take out the \ in the middle of the http
This post was edited on 11/13/14 at 4:43 pm
Posted by Big Data
Scotch Fan
Member since Nov 2007
2553 posts
Posted on 11/13/14 at 4:43 pm to
For first part:
=MID(LEFT(SUBSTITUTE(A1,":","%",2),FIND("%",A1)-1),FIND("/",A1)+2,LEN(A1))

I couldn't think of a better way to grab the 2nd colon, so I changed it to a %. This may not work depending on your string.
This post was edited on 11/13/14 at 4:44 pm
Posted by The 25 Jersey
Tiger Stadium
Member since Oct 2007
1345 posts
Posted on 11/13/14 at 4:43 pm to
Sorry for the multiple edits, the browser was changing the http into a link. You should be good to go now.
Posted by CAD703X
Liberty Island
Member since Jul 2008
78106 posts
Posted on 11/13/14 at 4:47 pm to


thanks guys. i think i have it..its not as elegant as your solution but i think having the sub strings in different cells is actually helpful when debugging 10,000 rows to make sure the data i'm extracting is consistent.

last question..the text is obviously in excel but all i get when i copy is the formula. thats just a matter of telling excel to paste the 'values' only?
Posted by The 25 Jersey
Tiger Stadium
Member since Oct 2007
1345 posts
Posted on 11/13/14 at 4:48 pm to
Yeah you can paste it into another column as "paste as text", or something similar. Then just copy that wherever you need.
Posted by Iosh
Bureau of Interstellar Immigration
Member since Dec 2012
18941 posts
Posted on 11/13/14 at 4:50 pm to
Do the users and passwords have colons or at-signs in them? If not, you could do that entire thing in a find/replace in Notepad++ with regular expression search mode.

Replace this: ftp://([^:]*):[^@]*@staging.domain.com

With this: http://production.domain.com/\1

Then use whatever to cut off the start and finish.
This post was edited on 11/13/14 at 4:53 pm
Posted by CAD703X
Liberty Island
Member since Jul 2008
78106 posts
Posted on 11/13/14 at 5:11 pm to
i love notepad++, didn't think about using that but i'm 99% there in excel now and i like having the pieces in multiple columns for sorting purposes.

last question; i found 2 substrings:

"blahaaa.domain.com" and "blahbbb.domain.com"

how can I use one expression to ignore 'aaa' and 'bbb'?

i tried "blah*.domain.com" and "blah***.domain.com" but those aren't working.

This post was edited on 11/13/14 at 5:12 pm
first pageprev pagePage 1 of 1Next pagelast page
refresh

Back to top
logoFollow TigerDroppings for LSU Football News
Follow us on Twitter, Facebook and Instagram to get the latest updates on LSU Football and Recruiting.

FacebookTwitterInstagram