
20120921

regexp examples for eclipse that match specific html/jsf tag and attribute names and also parsing java code

been working for hours on a regex to match all instances of a specific jsf tag name with a specific attribute name, practically getting an ulcer, and i finally figured out a simple solution:
TAG_NAME[^>]+ATTRIBUTE_NAME\s*=

if TAG_NAME == convertDateTime
if ATTRIBUTE_NAME == locale

then it will match e.g.:
<f:convertDateTime type="both" dateStyle="short" locale="en" timeZone="Europe/Oslo" />

and

<f:convertDateTime locale="#{myBean.localeString}" type="date" dateStyle="medium" timeZone="Europe/Oslo" />



[^>]
this means any character that isn't the tag-close character, > (newlines included, so tags spread over several lines still match). without it, the match could run past the end of the f:convertDateTime tag and pick up the locale attribute of a later tag, which we don't want:
<f:convertDateTime type="both" dateStyle="short" timeZone="Europe/Oslo" />
<rich:calendar locale="en" datePattern="dd.MM.yyyy"...

because [^>] can't get past the > that closes f:convertDateTime, the search stops at the end of that tag and the example above is not matched.


\s*=
this means that zero or more whitespace characters can sit between the ATTRIBUTE_NAME and the equals sign, e.g. this would also get matched:
<f:convertDateTime locale   ="#{myBean.localeString}" type="date" dateStyle="medium" timeZone="Europe/Oslo" />
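
not from the original post: a minimal java sketch for trying the pattern outside eclipse, assuming eclipse's regex search behaves like java.util.regex for this pattern. the class and method names (TagAttributeFinder, hasLocale) and the shortened sample tags are made up for illustration.

```java
import java.util.regex.Pattern;

class TagAttributeFinder {
    // TAG_NAME[^>]+ATTRIBUTE_NAME\s*= with convertDateTime and locale filled in
    static final Pattern TAG_WITH_ATTR =
            Pattern.compile("convertDateTime[^>]+locale\\s*=");

    // true if the markup contains a convertDateTime tag that has a locale attribute
    static boolean hasLocale(String markup) {
        return TAG_WITH_ATTR.matcher(markup).find();
    }

    public static void main(String[] args) {
        String multiLine = "<f:convertDateTime type=\"both\" dateStyle=\"short\"\n"
                + "    locale=\"en\" timeZone=\"Europe/Oslo\" />";
        String spaced = "<f:convertDateTime locale   =\"#{myBean.localeString}\" type=\"date\" />";
        String otherTag = "<f:convertDateTime type=\"both\" timeZone=\"Europe/Oslo\" />\n"
                + "<rich:calendar locale=\"en\" datePattern=\"dd.MM.yyyy\" />";

        System.out.println(hasLocale(multiLine)); // attribute on a later line of the same tag
        System.out.println(hasLocale(spaced));    // whitespace before the equals sign
        System.out.println(hasLocale(otherTag));  // locale belongs to rich:calendar only
    }
}
```

the first two calls should report a match and the third should not, because [^>]+ cannot cross the > that closes the convertDateTime tag. one caveat: an attribute value that itself contains a > (e.g. inside an EL comparison) would cut the match short.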


i couldn't find this solution anywhere and was surprised to find so few asking for it, but lots of people writing things like "you can't use regular expressions to parse html" (a post on Stack Overflow that got 4432 positive votes!).

here's a regexp that works in eclipse for finding a tag WITHOUT a specific attribute (i've tested it):
TAGNAME(?:\s+(?!ATTRIBUTENAME\b)[\w\-.:]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[\w\-.:]+))?)*\s*/?>
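
a small java sketch (an illustration, not part of the original post) that runs the same pattern with TAGNAME = convertDateTime and ATTRIBUTENAME = locale, again assuming java.util.regex matches the way eclipse does here. TagWithoutAttributeFinder and lacksLocale are made-up names.

```java
import java.util.regex.Pattern;

class TagWithoutAttributeFinder {
    // the pattern above with convertDateTime / locale substituted in
    static final Pattern TAG_WITHOUT_LOCALE = Pattern.compile(
            "convertDateTime(?:\\s+(?!locale\\b)[\\w\\-.:]+"
            + "(?:\\s*=\\s*(?:\"[^\"]*\"|'[^']*'|[\\w\\-.:]+))?)*\\s*/?>");

    // true if the text contains a convertDateTime tag with no locale attribute
    static boolean lacksLocale(String tag) {
        return TAG_WITHOUT_LOCALE.matcher(tag).find();
    }

    public static void main(String[] args) {
        String without = "<f:convertDateTime type=\"both\" dateStyle=\"short\" timeZone=\"Europe/Oslo\" />";
        String with = "<f:convertDateTime type=\"date\" locale=\"en\" />";

        System.out.println(lacksLocale(without)); // no locale attribute anywhere in the tag
        System.out.println(lacksLocale(with));    // locale present, so the tag is skipped
    }
}
```

the pattern walks the tag one attribute at a time, and the (?!locale\b) lookahead refuses to step over an attribute named locale, so a tag that has one can never reach the closing >.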

#######

here's an example of regexp search and replace in eclipse that uses grouping and back references for rich:calendar tags with locale attribute that either have a normal hardcoded text string value (e.g. "en") OR a backing/managed bean expression value (e.g. "#{someBean.someValue}") OR empty value (""):
(rich:calendar[^>]*locale\s*=\s*")[^"]*
replace with:
\1#\{myBean\.usersLocale\}

this inserts the EL expression #{myBean.usersLocale}, which resolves to the Locale value returned from MyBean.getUsersLocale()


[^"]*
this means match zero or more characters that aren't a ", i.e. stop matching once you reach the locale attribute's closing quote, e.g.:
<rich:calendar value="#{myBean.toDate}"  requiredMessage="required"
locale="no/NO"
datePattern="dd.MM.yyyy"
required="true"
/>
so the whole regex would match everything from rich:calendar up to (but not including) the closing quote of the locale value:
rich:calendar value="#{myBean.toDate}"  requiredMessage="required"
locale="no/NO

the first back reference (\1) group is the part of the regex inside the parentheses:
(rich:calendar[^>]*locale\s*=\s*")[^"]*
it covers everything from rich:calendar through locale=", so only no/NO (matched by [^"]*) gets replaced.
the enclosing parentheses ( and ) denote a group, so whatever matches inside them gets saved and can be pasted back into the result with a back reference: \1 for the first group, \2 for the second group, etc., e.g.:
(text-grouping-to-save-1)text-to-replace(text-grouping-to-save-2)
replace with:
\1new-text-to-replace-old-text\2

which would result in:
text-grouping-to-save-1new-text-to-replace-old-texttext-grouping-to-save-2
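
the same search and replace can be sketched in plain java (an illustration only, not something from the original post). note that java's String.replaceAll writes group references in the replacement as $1 rather than \1. CalendarLocaleReplacer and useBeanLocale are made-up names.

```java
import java.util.regex.Matcher;

class CalendarLocaleReplacer {
    // group 1 keeps everything from rich:calendar through locale="
    static final String SEARCH = "(rich:calendar[^>]*locale\\s*=\\s*\")[^\"]*";

    // $1 re-inserts group 1; quoteReplacement keeps the EL expression literal
    static final String REPLACEMENT =
            "$1" + Matcher.quoteReplacement("#{myBean.usersLocale}");

    static String useBeanLocale(String markup) {
        return markup.replaceAll(SEARCH, REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(useBeanLocale(
                "<rich:calendar locale=\"no/NO\" datePattern=\"dd.MM.yyyy\" />"));
        System.out.println(useBeanLocale("<rich:calendar locale=\"\" />"));
    }
}
```

both the hardcoded value and the empty value end up as locale="#{myBean.usersLocale}", while the rest of the tag is left untouched.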

##############

here's one that finds all elements with a locale attribute that doesn't have the value "#{myBean.usersLocale}":
locale\s*=\s*"(?!#\{myBean\.usersLocale\})[^"]*

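(?!XXX)
is a negative lookahead: the match only continues if the text at this position does NOT start with XXX. it consumes no characters and doesn't create a back reference (the question mark right after the opening parenthesis marks it as a special group).

(?:XXX)
is a non-capturing group: it matches the text XXX like a normal group, but doesn't create a back reference for it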
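
a short java sketch of the negative lookahead in action (illustrative only, assuming java.util.regex behaves like eclipse here; NonBeanLocaleFinder and needsFixing are made-up names):

```java
import java.util.regex.Pattern;

class NonBeanLocaleFinder {
    // locale="..." whose value does not start with #{myBean.usersLocale}
    static final Pattern OTHER_LOCALE = Pattern.compile(
            "locale\\s*=\\s*\"(?!#\\{myBean\\.usersLocale\\})[^\"]*");

    static boolean needsFixing(String markup) {
        return OTHER_LOCALE.matcher(markup).find();
    }

    public static void main(String[] args) {
        System.out.println(needsFixing("<rich:calendar locale=\"en\" />"));
        System.out.println(needsFixing("<rich:calendar locale=\"\" />"));
        System.out.println(needsFixing("<rich:calendar locale=\"#{myBean.usersLocale}\" />"));
    }
}
```

the hardcoded and empty values are found, the bean expression is not. because the lookahead only inspects the start of the value, a value that merely begins with #{myBean.usersLocale} and continues with something else would also be skipped.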


###############

someone had coded f:converter tags with a locale attribute, e.g.:
<f:converter locale="#{myBean.localeString}"
converterId="CustomIntConverter" />

so i needed a regex to delete all these locale attributes from all f:converter tags because locale isn't supported in this tag--not in jsf 1.2, 2.0 or 2.1! (however it is supported for related tags like f:convertDateTime and f:convertNumber)

here's the regex search:
(f:converter\s+[^>]*)locale\s*=\s*"[^"]*"

and replace with:
\1


i.e. keep everything that matched up until the locale attribute, and delete the locale attribute and its quoted value.
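
as a rough java equivalent (a sketch, not from the original post; java uses $1 in the replacement instead of \1, and ConverterLocaleRemover / stripLocale are made-up names):

```java
class ConverterLocaleRemover {
    // keep group 1 (the tag up to the locale attribute), drop locale="..."
    static String stripLocale(String markup) {
        return markup.replaceAll(
                "(f:converter\\s+[^>]*)locale\\s*=\\s*\"[^\"]*\"", "$1");
    }

    public static void main(String[] args) {
        String converter = "<f:converter locale=\"#{myBean.localeString}\"\n"
                + "converterId=\"CustomIntConverter\" />";
        System.out.println(stripLocale(converter));

        // f:convertDateTime is a different tag name, so it is left alone
        System.out.println(stripLocale("<f:convertDateTime locale=\"en\" />"));
    }
}
```

the whitespace that preceded the removed attribute stays behind, so the result can contain a trailing or doubled space; that is harmless in the markup but worth knowing if you diff the files afterwards.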



################

my next problem was finding all jsf tags that use some pageProps OR pageProperties class objects to get a property value and replace all references to such java classes with a standard resource bundle variable, msg, while at the same time keeping and reusing the properties label string value--yes, all in one search and replace!

so, i have code lines like this:
value="#{myBean.isBuyer ? myBean.pageProps.getPropertyValue('label.deviation.buyerdetails',  myBean.locale) : myBean.pageProps.getPropertyValue('label.deviation.transporterdetails',  myBean.locale)}"
...
value="#{myOtherBean.pageProperties.propertyValue('label.deviation.buyerdetails',  myBean.locale)}"

but i also have lines like this, that i don't want to match:
...
timeZone="#{myOtherBean.pageProperties.timeZone}" />

i want the resulting lines to look like this:
value="#{myBean.isBuyer ? msg['label.deviation.buyerdetails'] : msg['label.deviation.transporterdetails']}"
...
value="#{msg['label.deviation.buyerdetails']}"

to do this, i use a regexp search like this, in eclipse:
\w+Bean\.pageProp\w+\.\w+rop[^']+'([^']+)'[^\)]*\)
replace with:
msg\['\1'\]
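
here is how that search and replace could be checked in java against the sample lines above (a sketch only; PagePropsToMsg and toMsg are made-up names, and java writes the back reference as $1):

```java
class PagePropsToMsg {
    // group 1 captures the label key between the single quotes
    static final String SEARCH =
            "\\w+Bean\\.pageProp\\w+\\.\\w+rop[^']+'([^']+)'[^\\)]*\\)";

    static String toMsg(String line) {
        return line.replaceAll(SEARCH, "msg['$1']");
    }

    public static void main(String[] args) {
        System.out.println(toMsg("value=\"#{myBean.isBuyer ? "
                + "myBean.pageProps.getPropertyValue('label.deviation.buyerdetails',  myBean.locale) : "
                + "myBean.pageProps.getPropertyValue('label.deviation.transporterdetails',  myBean.locale)}\""));
        System.out.println(toMsg("value=\"#{myOtherBean.pageProperties.propertyValue("
                + "'label.deviation.buyerdetails',  myBean.locale)}\""));
        // no method call with a quoted key, so this line is not touched
        System.out.println(toMsg("timeZone=\"#{myOtherBean.pageProperties.timeZone}\" />"));
    }
}
```

the \w+rop part is what separates getPropertyValue/propertyValue calls from plain properties like timeZone, and [^\)]*\) swallows the remaining arguments up to the closing parenthesis.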


##########SORT OF UNSOLVED MYSTERY ###############

here i have a java method:
    private String getPropertyValue(String property) {
        if(props == null)
            props = new RBUtils(Locale.ENGLISH, null);
        return RBUtils.getString(property, null, null);
    }


i want to find all java methods with that specific name and remove all lines before the return statement, like this:
    private String getPropertyValue(String property) {
        return RBUtils.getString(property, null, null);
    }


 
here's the regexp that took me about an hour to come up with:
search for:
(String\s+getPropertyValue\([^\{]+\{)[\s\S]+?(return\s+RBUtils)
replace with:
\1\n\2


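explanation:
[\s\S]
matches any character, newlines included (anything that is either whitespace or non-whitespace).

\n in the replacement is just to place a newline in front of the return statement.

the ? after [\s\S]+ stops greediness and forces laziness, so the match ends at the first "return RBUtils" it reaches.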

UPDATE
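this one will "fail" if you have the following code, because the first method contains no "return RBUtils", so the lazy match keeps going into the next method and everything in between gets deleted:
    private String getPropertyValue(String property){
        String s= RBUtils.getString(property, null, null);
        return s;
    }

    public String getPropertyLabelValue(String property){
        return RBUtils.getString("label."+property, null, null);
    }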


so i need to figure out how to stop the matching at the end of the method. one idea i have is to stop it at a public OR private, i.e. if it hasn't matched before it reaches a "public" or "private" string, then it's not a match.

let's try to put to words exactly what i want:
* i want to match within a method with the following signature:
String getPropertyValue(String name)

where name can be any identifier: \w+
and the preceding text is matched literally (apart from flexible whitespace): (String\s+getPropertyValue\s*\(\s*String\s+)

so the whole first line is:
(String\s+getPropertyValue\s*\(\s*String\s+\w+)

the surrounding parentheses are a grouping, so we can save that text and reinsert it in the replace action by using what is called a backreference, like this:
\1

to end the matching at the end of the method is quite difficult though, if not impossible, illustrated by the following examples.

one way is to stop the matching at the first instance of a private or public:
    private String getPropertyValue(String property){
        String s= RBUtils.getString(property, null, null);
        return s;
    }

    public ...


(note: this doesn't match because we didn't find "return RBUtils")

but if you have the following code:
...
    private String getPropertyValue(String property){
        String s= RBUtils.getString(property, null, null);
        return s;
    }
} // end of the java class


there is no method after the method we're interested in--the class ends. one solution would be to somehow count the number of curly braces and stop at the last "}". i don't know how to do that or if that's possible, so someone please let me know =)
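
one possible way around this, sketched in java rather than as a single regex (to my knowledge java.util.regex has no recursion or balancing construct for counting nested braces): use the regex only to find the method header, then count braces by hand to find where the method ends. this is an illustration under assumptions, not a tested eclipse recipe; MethodBodyScanner, endOfBlock and bodyOf are made-up names, and the scan ignores braces inside string literals and comments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MethodBodyScanner {
    // the header regex from above, extended to include the opening brace
    static final Pattern HEADER = Pattern.compile(
            "String\\s+getPropertyValue\\s*\\(\\s*String\\s+\\w+\\s*\\)\\s*\\{");

    // index just past the } that closes the block opened at openBrace, or -1
    static int endOfBlock(String src, int openBrace) {
        int depth = 0;
        for (int i = openBrace; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i + 1;
            }
        }
        return -1; // unbalanced braces
    }

    // text between the braces of the first getPropertyValue(String) method, or null
    static String bodyOf(String src) {
        Matcher m = HEADER.matcher(src);
        if (!m.find()) {
            return null;
        }
        int open = m.end() - 1; // position of the { matched by HEADER
        int end = endOfBlock(src, open);
        return end < 0 ? null : src.substring(open + 1, end - 1);
    }

    public static void main(String[] args) {
        String src = "    private String getPropertyValue(String property){\n"
                + "        String s= RBUtils.getString(property, null, null);\n"
                + "        return s;\n"
                + "    }\n"
                + "} // end of the java class\n";
        System.out.println(bodyOf(src));
    }
}
```

with the body isolated like this, a check such as body.contains("return RBUtils") can decide whether the method is one to rewrite, and it works the same whether another method or the end of the class follows.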

what i originally, specifically wanted was to match the following:
    private String getPropertyValue(String property) {
        if(props == null)
            props = new RBUtils(Locale.ENGLISH, null);
        return RBUtils.getString(property, null, null);
    }


so to solve this specific problem (instead of finding all methods with that signature and erasing any variation of code before the return statement--which currently seems impossible) i need to use the following regex:
(String\ +getPropertyValue\ *\(\ *String.+\s)(?:.+props.+\s){2}(.+\s.+\})
replace with:
\1\2

(?:          //group, but don't make a backreference
.+props.+\s          //match a whole line that contains "props", including its newline (\s)
)          //end grouping
{2}          //match exactly 2 such "props" lines in a row

(.+\s.+\})          //match the whole "return" line, plus the line with the method-closing curly brace and put all this into a group, so we can backreference this second group with \2
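
a java sketch of this specific replacement (illustrative only; it assumes \n line endings, uses $1$2 where eclipse uses \1\2, and PropsLinesRemover / strip are made-up names):

```java
class PropsLinesRemover {
    static final String SEARCH =
            "(String\\ +getPropertyValue\\ *\\(\\ *String.+\\s)(?:.+props.+\\s){2}(.+\\s.+\\})";

    // drops the two "props" lines between the method header and the return line
    static String strip(String source) {
        return source.replaceAll(SEARCH, "$1$2");
    }

    public static void main(String[] args) {
        String method = "    private String getPropertyValue(String property) {\n"
                + "        if(props == null)\n"
                + "            props = new RBUtils(Locale.ENGLISH, null);\n"
                + "        return RBUtils.getString(property, null, null);\n"
                + "    }\n";
        System.out.print(strip(method));
    }
}
```

since . does not match line breaks, each .+props.+\s stays within one line, which is what keeps this pattern from running into a neighbouring method the way the [\s\S]+? version could.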


#####################

today's fun example, 20120928, was about me figuring out that you can't use a "+" sign to dynamically concatenate label text and then get that label property value from a properties file. luckily i found this post that told me i could use EL 2.2's concat functionality.

so i need to convert all instances of this kind of JSF code:
#{msg['random-text-1'+random-text-2]}
to this:
#{msg['random-text-1-reinserted-here'.concat(random-text-2-reinserted-here)]}
search for:
(#\{\s*msg\[\s*'[^']+')\s*\+\s*(\w[^\]]+)
replace with:
\1\.concat\(\2\)
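
the conversion can be tried in java like this (a sketch, not from the original post; the label key 'label.status.' and the variable item.status are invented sample values, PlusToConcat / toConcat are made-up names, and java uses $1 and $2 in the replacement):

```java
class PlusToConcat {
    // group 1: #{msg['...'   group 2: the expression after the + sign
    static final String SEARCH =
            "(#\\{\\s*msg\\[\\s*'[^']+')\\s*\\+\\s*(\\w[^\\]]+)";

    static String toConcat(String el) {
        return el.replaceAll(SEARCH, "$1.concat($2)");
    }

    public static void main(String[] args) {
        System.out.println(toConcat("#{msg['label.status.'+item.status]}"));
        System.out.println(toConcat("#{msg['label.status.' + item.status]}"));
    }
}
```

both inputs should come out as #{msg['label.status.'.concat(item.status)]}, since \s*\+\s* swallows any spaces around the plus sign.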


UPDATE 20131119

this regex searches for primefaces tags that have an update attribute, but are missing an update id for growl at the start of the value (the lookahead only checks what comes right after the opening quote):
\bupdate\s*=[\s:a-zA-Z]*"(?!:growl)[^"]*
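
a quick java sketch showing what this pattern does and doesn't catch (illustrative only; the component ids like :form:table are invented, and GrowlUpdateFinder / flagged are made-up names):

```java
import java.util.regex.Pattern;

class GrowlUpdateFinder {
    // update="..." whose value does not begin with :growl
    static final Pattern NO_LEADING_GROWL = Pattern.compile(
            "\\bupdate\\s*=[\\s:a-zA-Z]*\"(?!:growl)[^\"]*");

    static boolean flagged(String tag) {
        return NO_LEADING_GROWL.matcher(tag).find();
    }

    public static void main(String[] args) {
        System.out.println(flagged("<p:commandButton update=\":form:table\" />"));
        System.out.println(flagged("<p:commandButton update=\":growl :form:table\" />"));
        // growl is listed, but not first, so the tag is still flagged
        System.out.println(flagged("<p:commandButton update=\":form:table :growl\" />"));
    }
}
```

so the search is reliable only when :growl is always written first in the update list; tags that list it later show up as false positives.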


UPDATE 20131211

after migrating from richfaces to primefaces, i needed to remove all calendar attributes called "verticalOffset" (valid in rich:calendar, but not in p:calendar):

search/match:
verticalOffset\s*=\s*"-*[0-9]+"

replace with:
(nothing)
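
the same removal as a java one-liner, for anyone scripting the migration instead of using the eclipse dialog (a sketch only; the sample tag is invented and VerticalOffsetRemover / strip are made-up names):

```java
class VerticalOffsetRemover {
    // deletes verticalOffset="N" or verticalOffset="-N", leaving the rest of the tag
    static String strip(String markup) {
        return markup.replaceAll("verticalOffset\\s*=\\s*\"-*[0-9]+\"", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("<p:calendar verticalOffset=\"-20\" pattern=\"dd.MM.yyyy\" />"));
    }
}
```

the space that preceded the attribute is left in place, so the tag ends up with a double space where verticalOffset used to be.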

