mnajem

Regular Expressions for C


Salam,

I want to use PCRE for string manipulation and searching in C. Looking around the internet, plenty of scripting languages use it, but I can't find much for C.

String manipulation is normally done with strcmp and strtok, right?

There is an example here:

http://www.nathanr.net/programming/hints/pcre.shtml

He includes pcre.h as a header (I can run his code, provided libpcre<version>-dev is installed).

For the functions:

pcre_compile

pcre_exec

what is the syntax so that I can ask the user to enter a pattern and then have pcre_compile and pcre_exec interpret it?


Based on the example given,

re = pcre_compile("(\\S+)\\s*:\\s*(\\S+)", 0, &error, &erroffset, NULL);
matches = pcre_exec(re, NULL , subject, strlen(subject), 0, 0, ovector, 30);
the pattern it wants is a C string, "(\\S+)\\s*:\\s*(\\S+)", which means we can replace it with our own string through a pointer.
printf("Sila masukkan pattern: ");          /* prompt: "Please enter a pattern" */
fgets(szPattern,75,stdin);
szPattern[strcspn(szPattern, "\n")] = '\0'; /* fgets keeps the newline, so strip it */
re = pcre_compile(szPattern, 0, &error, &erroffset, NULL);


Is it possible to create a .txt file that contains the rules and have it interpret them?

Say, in one file:

ayam.txt

^[a-z][A-Z][0-9]akak

^[x-z]orro

Would I use fopen for that? I haven't done any programming in a long time.

zeph, in your opinion, which is faster for searching and manipulating strings: exact string matching such as Boyer-Moore, KMP, and Zhu-Takaoka, or regular expressions?

Or does the PCRE engine itself also use a string search algorithm?

I ask because I noticed that Snort and GNU grep use Boyer-Moore, yet they accept regexps as patterns.

What I'm actually trying to do is implement a so-called extended version of KMP that accepts regular expressions, but I don't know how, because given text like this:

ATAFHAH AHFTHA HAHTHAHH

if I fetch using "exact string matching" I can't use a wildcard, i.e.:

AT*

Instead, the string matching algorithm accepts raw text, i.e.:

ATAF (it just skips ahead when there is a mismatch).
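
For the exact-matching side of the question, here is a minimal KMP sketch in plain C. It is not taken from the thread, and it is not a claim about how PCRE, Snort, or grep work internally. The function name kmp_search is made up for illustration. It only handles raw text patterns such as ATAF, with no wildcards at all.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first occurrence of pat in
 * text, or -1 if there is none (also -1 if the table cannot be allocated). */
long kmp_search(const char *text, const char *pat)
{
    size_t n = strlen(text);
    size_t m = strlen(pat);
    size_t *fail;
    size_t i, k;
    long found = -1;

    if (m == 0)
        return 0;                      /* empty pattern matches at the start */

    fail = malloc(m * sizeof *fail);
    if (fail == NULL)
        return -1;

    /* fail[i] = length of the longest proper prefix of pat[0..i]
     * that is also a suffix of pat[0..i]. */
    fail[0] = 0;
    k = 0;
    for (i = 1; i < m; i++) {
        while (k > 0 && pat[i] != pat[k])
            k = fail[k - 1];
        if (pat[i] == pat[k])
            k++;
        fail[i] = k;
    }

    /* Scan the text once; on a mismatch fall back through the table
     * instead of re-reading text characters. */
    k = 0;
    for (i = 0; i < n; i++) {
        while (k > 0 && text[i] != pat[k])
            k = fail[k - 1];
        if (text[i] == pat[k])
            k++;
        if (k == m) {
            found = (long)(i + 1 - m);
            break;
        }
    }

    free(fail);
    return found;
}
```

Calling kmp_search("ATAFHAH AHFTHA HAHTHAHH", "ATAF") finds the pattern at the very start of the text. The "skip on mismatch" behaviour is the fall-back through the fail table.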

Edited by mnajem


> Is it possible to create a .txt file that contains the rules and have it interpret them?
>
> Say, in one file:
>
> ayam.txt
>
> ^[a-z][A-Z][0-9]akak
>
> ^[x-z]orro
>
> Would I use fopen for that? I haven't done any programming in a long time.

Yes, you need fopen if you want to keep the regexp strings in a file. It takes a bit of line handling and memory allocation.
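
A rough sketch of that "line handling and memory allocation" part, in plain C. The names load_patterns, free_patterns, and MAX_PATTERN_LEN are made up here, and the one-rule-per-line layout is assumed from the ayam.txt example. Each string it returns could then be passed to pcre_compile in place of the hard-coded pattern.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PATTERN_LEN 256   /* assumed upper bound on the length of one rule */

/* Hypothetical helper: reads one pattern per line from path, skipping blank
 * lines. On success stores a malloc'ed array of malloc'ed strings in *out and
 * returns how many were read; returns -1 on I/O or allocation failure.
 * Lines longer than MAX_PATTERN_LEN - 1 characters are not handled specially. */
int load_patterns(const char *path, char ***out)
{
    FILE *fp = fopen(path, "r");
    char line[MAX_PATTERN_LEN];
    char **list = NULL;
    int count = 0;

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        char **grown;
        char *copy;

        line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
        if (line[0] == '\0')
            continue;                          /* ignore empty lines */

        grown = realloc(list, (size_t)(count + 1) * sizeof *list);
        if (grown == NULL)
            goto fail;
        list = grown;

        copy = malloc(strlen(line) + 1);
        if (copy == NULL)
            goto fail;
        strcpy(copy, line);
        list[count++] = copy;
    }

    fclose(fp);
    *out = list;
    return count;

fail:
    while (count > 0)
        free(list[--count]);
    free(list);
    fclose(fp);
    return -1;
}

/* Releases everything load_patterns allocated. */
void free_patterns(char **list, int count)
{
    int i;

    for (i = 0; i < count; i++)
        free(list[i]);
    free(list);
}
```

Stripping the line ending matters: if the newline stays in the string it becomes part of the pattern.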

> zeph, in your opinion, which is faster for searching and manipulating strings: exact string matching such as Boyer-Moore, KMP, and Zhu-Takaoka, or regular expressions?

Perhaps what you want is http://en.wikipedia.org/wiki/Fuzzy_string_searching

> Or does the PCRE engine itself also use a string search algorithm?
>
> I ask because I noticed that Snort and GNU grep use Boyer-Moore, yet they accept regexps as patterns.
>
> What I'm actually trying to do is implement a so-called extended version of KMP that accepts regular expressions, but I don't know how, because given text like this:
>
> ATAFHAH AHFTHA HAHTHAHH
>
> if I fetch using "exact string matching" I can't use a wildcard, i.e.:
>
> AT*
>
> Instead, the string matching algorithm accepts raw text, i.e.:
>
> ATAF (it just skips ahead when there is a mismatch).

I have never used algorithms like these, so I can't give you an accurate answer :D

If you still want to use PCRE, you can go with a quick and dirty solution (slow brute forcing): apply the regexp to every string read from whatever source you need.

Example in pseudocode:

1. regexp ('ATAFHAH', AT*) if true, found

2. regexp ('AHFTHA', AT*) if true, found

....
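
The pseudocode above could be sketched like this. To keep it buildable without libpcre, this version swaps in POSIX regcomp/regexec from <regex.h> instead of PCRE, so it is an illustration of the brute-force loop and not PCRE code. The function name count_matches is made up. With PCRE the loop would look the same, with pcre_compile once and pcre_exec per string.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: applies one compiled pattern to each of the n strings
 * and returns how many match, or -1 if the pattern does not compile.
 * Uses POSIX extended regular expressions, not PCRE. */
int count_matches(const char *pattern, const char *const *words, size_t n)
{
    regex_t re;
    size_t i;
    int hits = 0;

    /* Compile once, outside the loop; REG_NOSUB because only yes/no is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (regexec(&re, words[i], 0, NULL, 0) == 0)
            hits++;
    }

    regfree(&re);
    return hits;
}
```

One thing to watch: as a regular expression, AT* means "A followed by zero or more T", so it matches any string that contains an A. The shell-wildcard meaning "starts with AT" is written ^AT in regexp syntax.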


#include <pcre.h> 
#include <string.h> 
int main(int argc, char **argv) 
{ pcre *re = NULL; pcre_extra *pe = NULL; 
const char *error = NULL; 
int erroffset; 
int ovector[75]; 
int matches; 
const char *match_string; 
const char *subject; 
//const char *subject = "foo: bar"; 
int x; 
int stdin;
int szPattern;

        printf("Sila masukkan pattern: ");
        fgets(szPattern,75,stdin);
        re = pcre_compile(szPattern, 0, &error, &erroffset, NULL);
//      re = pcre_compile("(\\S+)\\s*:\\s*(\\S+)", 0, &error, &erroffset, NULL); 
        matches = pcre_exec(re, NULL , subject, strlen(subject), 0, 0, ovector, 100); 
        printf("subject=%s\", matches=%d\n", subject, matches); 

        for (x=0; x < matches; x++) 
        { pcre_get_substring(subject, ovector, matches, x, &match_string); 
                printf("match %d: \"%s\")\n", x, match_string); pcre_free_substring(match_string); 
        } 

        return 0;

}
I get this error:
 >gcc mycode.c  -o mycode -lpcre
mycode.c: In function ‘main’:
mycode.c:16: warning: incompatible implicit declaration of built-in function ‘printf’
mycode.c:18: warning: passing argument 1 of ‘pcre_compile’ makes pointer from integer without a cast

[33] >./mycode 
Sila masukkan pattern: subject=�$�,j", matches=-1
Segmentation fault (core dumped)

At compile time there are only warnings, but I get a segmentation fault when I run it.


The problem is int szPattern; it should be char szPattern[80]={0};

If there are any other problems I can't help, because

zeph@darkthrone:~/pcre$ ./test 
./test: error while loading shared libraries: libpcre.so.0: cannot open shared object file: No such file or directory
zeph@darkthrone:~/pcre$ whereis libpcre.so
libpcre: /usr/local/lib/libpcre.a /usr/local/lib/libpcre.so /usr/local/lib/libpcre.la

:(


whereis libpcre
libpcre: /usr/lib/libpcre.a /usr/lib/libpcre.so
That is my libpcre. I have changed the code: it now compiles with only warnings and produces a binary. The problem is that the program hangs with no output. To get some debug output I should use ltrace, right?
ltrace ./mycode 

__libc_start_main(0x8048564, 1, 0xbf967964, 0x8048730, 0x8048720 <unfinished ...>
memset(0xbf96786c, '00', 80)                                                                   = 0xbf96786c
printf("Sila masukkan pattern: ")                                                                = 23

Edited by mnajem


When we want to use a library written by someone else (not the standard C/C++ library), we have to read that library's manual. Here is the reference for pcre_exec().

#include <pcre.h>

int pcre_exec(const pcre *code, const pcre_extra *extra,
              const char *subject, int length, int startoffset,
              int options, int *ovector, int ovecsize);

DESCRIPTION

This function matches a compiled regular expression against a given subject string, using a matching algorithm that is similar to Perl's. It returns offsets to captured substrings. Its arguments are:

code         Points to the compiled pattern
extra        Points to an associated pcre_extra structure, or is NULL
subject      Points to the subject string
length       Length of the subject string, in bytes
startoffset  Offset in bytes in the subject at which to start matching
options      Option bits
ovector      Points to a vector of ints for result offsets
ovecsize     Number of elements in the vector (a multiple of 3)

But you wrote

const char *subject;
//const char *subject = "foo: bar";

which means your subject pointer is never initialised, so of course there is no result :)
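
Since the manual says pcre_exec "returns offsets to captured substrings", here is a small sketch of how those offsets can be read. It is plain C with no PCRE calls. The helper name copy_group is made up, and in the usage note the offsets are filled in by hand instead of coming from a real pcre_exec call. The layout it assumes is the documented one: ovector[2*g] and ovector[2*g + 1] hold the start and end offset of group g, and group 0 is the whole match.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies capture group `group` of subject into buf and
 * NUL-terminates it. ovector is expected in the pcre_exec layout described
 * above. Returns the copied length, or -1 if the group is unset or buf is
 * too small. */
int copy_group(const char *subject, const int *ovector, int group,
               char *buf, size_t bufsize)
{
    int start = ovector[2 * group];
    int end = ovector[2 * group + 1];
    size_t len;

    if (start < 0 || end < start)
        return -1;                 /* unset groups are reported as -1 */

    len = (size_t)(end - start);
    if (len + 1 > bufsize)
        return -1;                 /* no room for the text plus the NUL */

    memcpy(buf, subject + start, len);
    buf[len] = '\0';
    return (int)len;
}
```

For the subject "foo: bar" and the pattern from the example, hand-written offsets {0, 8, 0, 3, 5, 8} would give "foo: bar" for group 0, "foo" for group 1, and "bar" for group 2. Also note that the ovecsize argument should be the real number of elements in the array: declaring int ovector[75] and passing 100 tells PCRE it may write past the end.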


Try this code

#include <pcre/pcre.h>
#include <string.h>
int main(int argc, char **argv)
{
    pcre *re = NULL;
    pcre_extra *pe = NULL;
    const char *error = NULL;
    int erroffset;
    int ovector[30];
    int matches;
    const char *match_string;
    const char *subject = "bog : log";
    int x;
    
    char szPattern[76]={0};

    printf("Sila masukkan pattern: ");
    fgets(szPattern,75,stdin);

    re = pcre_compile(szPattern, 0, &error, &erroffset, NULL);
    matches = pcre_exec(re, NULL , subject, strlen(subject), 0, 0, ovector, 30);
    printf("subject=\"%s\", matches=%d\n", subject, matches);

    for (x=0; x < matches; x++)
    {
        pcre_get_substring(subject, ovector, matches, x, &match_string);
        printf("match %d: \"%s\")\n", x, match_string);
        pcre_free_substring(match_string);
    }

    return 0;
}

Please use this as the input (typed with single backslashes; the doubled ones are only needed inside a C string literal):

(\S+)\s*:\s*(\S+)
and show me the result, because I don't really know how to use regexp syntax.
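
A note on the backslashes, with a tiny check in plain C. Inside C source, "\\S" is how you write the two characters \S, so the compiled program only ever sees one backslash. Text typed at the prompt is not C source, so it should contain single backslashes. The helper name count_backslashes is made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: counts the backslash characters actually stored in s. */
size_t count_backslashes(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (*s == '\\')
            n++;
    }
    return n;
}
```

Passing the literal "(\\S+)\\s*:\\s*(\\S+)" to it counts 4 backslashes, not 8. If the doubled form is typed at the prompt, the pattern really does contain 8, and the regexp engine reads each pair as one literal backslash character, which is a different pattern.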


 gcc zephpcre.c -o zephpcre -lpcre
zephpcre.c: In function ‘main’:
zephpcre.c:17: warning: incompatible implicit declaration of built-in function ‘printf’
zephpcre.c:18: error: ‘stdin’ undeclared (first use in this function)
zephpcre.c:18: error: (Each undeclared identifier is reported only once
zephpcre.c:18: error: for each function it appears in.)

The header did not work at first either:

zephpcre.c:1:23: error: pcre/pcre.h: No such file or directory

The correct one is pcre.h

Edited by mnajem


How about adding stdio.h?

#include <stdio.h>

Edit: Hehe, turns out I replied to an old thread, I didn't notice

Edited by mchammer
