Manipulating Character Strings in R for Data Science

When we analyze data in Data Science, we do not work only with numbers. We often find ourselves handling data in the form of character strings. As defined by Per Christensson (2006), a string is a data type used in programming to represent text rather than numbers. It is made up of a set of characters that can also include spaces, digits, and many symbols used in everyday work (hyphens, currency symbols, etc.).

It therefore seems clear that, if we want to analyze data effectively, we need to know how to work with character strings. To explore this, we chose R (https://www.r-project.org/), one of the leading tools in Data Science. R is much more than a statistical tool; it is a programming language in which we can create our own objects, functions, and packages.

Even though some of us may argue that it is not very intuitive as a programming language (unlike Java or Python, for example), R can prove very valuable if we know how to use it.

Above all, R has many advantages that are always worth recalling. It is easy to install and cross-platform, so you can use it on any major operating system. R is also free: you can use it in any company without having to buy a license. Finally, it is open source: anyone can inspect the source code and potentially fix bugs and/or add features.

In R, text is mainly handled as “strings”, which are stored in character vectors.

In this article, we describe some of the most useful Data Science tools that we regularly use to manage and process character strings. As a practical example, we will use the “state” data sets that ship with R by default (they describe the 50 states of the United States of America).

is.character() and as.character()

Before we start working with strings, it is good practice to check that our data really are strings. We can do this with the “is.character()” function.

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-23.png]
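A minimal sketch of this check, assuming the built-in state.name and state.area vectors (the variable names name_is_chr and area_is_chr are only illustrative):

```r
# state.name and state.area ship with base R (datasets package).
name_is_chr <- is.character(state.name)  # TRUE: a character vector
area_is_chr <- is.character(state.area)  # FALSE: a numeric vector

name_is_chr
area_is_chr
```

Checking the type first avoids surprises later, since functions such as nchar() or paste() silently convert non-character input.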

If, for example, we need to convert “state.area” (a numeric vector of state areas, in square miles) to strings, we can simply apply the “as.character()” function.

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-24.png]
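The conversion could be sketched as follows; area_chr is a hypothetical name chosen for this illustration:

```r
# Convert the numeric areas to strings.
area_chr <- as.character(state.area)

is.character(area_chr)  # TRUE
head(area_chr, 3)       # the first three areas, now quoted strings

# The conversion can be undone with as.numeric().
all(as.numeric(area_chr) == state.area)
```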

tolower() and toupper()

Like most programming languages, R is case-sensitive: it distinguishes between uppercase and lowercase letters. The “tolower()” and “toupper()” functions help us handle this.

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-25.png]
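As a small illustration (the strings compared and the variable names are assumptions made for this sketch), normalising the case before comparing removes the mismatch:

```r
# Comparison is case-sensitive by default.
raw_match <- "Texas" == "texas"            # FALSE

# Normalising the case first makes the comparison succeed.
norm_match <- tolower("Texas") == "texas"  # TRUE

lower3 <- tolower(state.name[1:3])  # "alabama" "alaska" "arizona"
upper3 <- toupper(state.name[1:3])  # "ALABAMA" "ALASKA" "ARIZONA"
```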

nchar()

We may need to know the length of our character strings. The “nchar()” function counts the number of characters (of any kind, including spaces) in a given string.

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-26.png]
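A short sketch of nchar() applied to the state names; name_lengths and ny_length are hypothetical names used only here:

```r
# nchar() is vectorised: it returns one length per element.
name_lengths <- nchar(state.name)
head(name_lengths, 3)  # 7 6 7 (Alabama, Alaska, Arizona)

# Spaces count as characters too.
ny_length <- nchar("New York")  # 8
```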

For example, we might want to retrieve only the states whose names have 6 letters:

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-27.png]
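One way to write this filter, using logical indexing and the vector name statesWith6characters that the article uses later (the exact code in the original screenshot may differ):

```r
# Keep only the names that are exactly 6 characters long.
statesWith6characters <- state.name[nchar(state.name) == 6]

statesWith6characters
# "Alaska" "Hawaii" "Kansas" "Nevada" "Oregon"
```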

str_count()

In the same vein as “nchar()”, another function gives us the number of occurrences of a specific character in one or more strings: “str_count()”, from the stringr package. Here is a practical example with the character “k” (lowercase only):

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-28.png]
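The count could be sketched as below. str_count() comes from the add-on package stringr, so this sketch falls back to a base R equivalent when stringr is not installed; the fallback and the name k_counts are additions for illustration, not part of the original article.

```r
# Count lowercase "k" in every state name.
if (requireNamespace("stringr", quietly = TRUE)) {
  k_counts <- stringr::str_count(state.name, "k")
} else {
  # Base R equivalent: locate every literal "k", then count the matches.
  k_matches <- gregexpr("k", state.name, fixed = TRUE)
  k_counts <- lengths(regmatches(state.name, k_matches))
}

# Label each count with its state to make the result easier to read.
names(k_counts) <- state.name
k_counts[k_counts > 0]
```

Because the match is case-sensitive, “Kansas” gets a count of 0 while “Alaska” gets 1.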

grep()

Now let us turn the problem around. Suppose that this time we need to select the states whose names contain the letter “w”, in either uppercase or lowercase (as we have already seen, R distinguishes between the two). The “grep()” function works perfectly in this scenario:

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-29.png]
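A sketch of the search, assuming the pattern is written as the regular-expression character class [wW] (which matches either letter); w_positions and w_names are hypothetical names:

```r
# value = FALSE (the default) returns the positions of the matches.
w_positions <- grep(pattern = "[wW]", x = state.name, value = FALSE)

# value = TRUE returns the matching strings themselves.
w_names <- grep(pattern = "[wW]", x = state.name, value = TRUE)

# The two results describe the same states.
identical(state.name[w_positions], w_names)

# An alternative spelling of the same search.
w_names_alt <- grep("w", state.name, ignore.case = TRUE, value = TRUE)
```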

Since we were looking for states containing either “w” or “W”, we set the “pattern” argument to match both characters, written as the character class “[wW]” (a bare “wW” would match only that literal two-letter sequence). Note the difference between setting the “value” argument to “FALSE” or to “TRUE”: in the first case, we get the positions in the “state.name” vector of the strings that match our search; in the second, we get the matching strings themselves.

paste()

While working with character strings, we may need to combine several strings into one. The “paste()” function takes one or more R objects, converts them to character vectors, and then concatenates them.

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-30.png]
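For illustration, here is paste() with “sep” applied to five separate strings; the five six-letter state names are an assumption about which strings the original example used:

```r
# Five separate character objects, joined with ", " between them.
joined <- paste("Alaska", "Hawaii", "Kansas", "Nevada", "Oregon", sep = ", ")

joined
# "Alaska, Hawaii, Kansas, Nevada, Oregon"
```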

In the example above, we combined 5 character objects and separated them with a comma and a space. As we can see, the “sep” argument is a character string used as the separator. On the other hand, if we want to combine strings that belong to a single character vector, we use the “collapse” argument:

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-31.png]
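A minimal sketch of “collapse”, assuming the single vector is the one holding the six-letter state names and that the separator is a comma and a space:

```r
# One character vector with 5 elements.
statesWith6characters <- state.name[nchar(state.name) == 6]

# collapse joins the elements of that one vector into a single string.
collapsed <- paste(statesWith6characters, collapse = ", ")

collapsed
# "Alaska, Hawaii, Kansas, Nevada, Oregon"
```

With a single vector as input, “sep” has nothing to separate, which is why “collapse” is the argument that matters here.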

Now, note the difference between the following examples:

[Image: https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-32.png]
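The two calls being compared could be sketched as follows; with_sep and with_collapse are hypothetical names, and the exact calls in the original screenshot may differ:

```r
statesWith6characters <- state.name[nchar(state.name) == 6]

# sep only: one result per state, so the output keeps 5 elements.
with_sep <- paste(statesWith6characters, "is a state", sep = " ")
length(with_sep)  # 5
with_sep[1]       # "Alaska is a state"

# collapse only: the 5 results are then joined into one string.
with_collapse <- paste(statesWith6characters, "is a state", collapse = " ")
length(with_collapse)  # 1
```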

In the first case, using “sep”, we combine each element of the “statesWith6characters” vector with the single string “is a state”, separated only by a space. As a result, we get a new character vector, still made up of 5 elements, each of which is now the combination of an element of “statesWith6characters” and “is a state”. In the second example, we still combine each element of “statesWith6characters” with “is a state” (again separated by a space), but this time we create a single string (a character vector of length one) rather than a vector of 5 elements.

[/tatsu_inline_text][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “B1D8y1lJB”]

To obtain a clearer result, we can use the two arguments "sep" and "collapse" together, like this:
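As a hedged sketch of that combination: the separator values below (" " for "sep", ", " for "collapse") are illustrative assumptions and may differ from those in the screenshot, and the vector is again assumed to come from "state.name".

```r
# Assumption: the vector holds the state names that are exactly 6 characters long
statesWith6characters <- state.name[nchar(state.name) == 6]

# sep joins the pieces inside each element; collapse joins the elements together
combined <- paste(statesWith6characters, "is a state", sep = " ", collapse = ", ")
print(combined)
```

Using a distinct "collapse" value makes the boundaries between the original elements visible in the final string.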

[/tatsu_inline_text][tatsu_image image= “https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-33.png” image_varying_size_src= “” alignment= “none” border_width= “0” border_color= “” id= “6247” size= “full” adaptive_image= “0” max_width= ‘{“d”:”100%”}’ rebel= “0” width= ‘{“d”:”100%”}’ shadow= “none” custom_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” drop_shadow= “drop-shadow(0px 0px 0px rgba(0,0,0,0))” border_radius= “0” lazy_load= “1” placeholder_bg= “” offset= ‘{“d”:”0px 0px”}’ lightbox= “0” link= “” new_tab= “0” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” padding= ‘{“d”:””}’ margin= ‘{“d”:””}’ key= “Sym5PRJkB”][/tatsu_image][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “ByW5R6kJr”]

strsplit()

[/tatsu_inline_text][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “SJzqOeZyr”]

The opposite of "paste()" is "strsplit()", which lets us split one or more strings into shorter strings.

[/tatsu_inline_text][tatsu_image image= “https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-34.png” image_varying_size_src= “” alignment= “none” border_width= “0” border_color= “” id= “6248” size= “full” adaptive_image= “0” max_width= ‘{“d”:”100%”}’ rebel= “0” width= ‘{“d”:”100%”}’ shadow= “none” custom_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” drop_shadow= “drop-shadow(0px 0px 0px rgba(0,0,0,0))” border_radius= “0” lazy_load= “1” placeholder_bg= “” offset= ‘{“d”:”0px 0px”}’ lightbox= “0” link= “” new_tab= “0” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” padding= ‘{“d”:””}’ margin= ‘{“d”:””}’ key= “Bybi_l-JH”][/tatsu_image][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “BJd6PR1kB”]

Note how the "split" argument acts as the delimiter when the strings are split: a single space in the first case, the letter "w" in the second.
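A small sketch of both delimiters follows. The input strings are chosen here for illustration and are not necessarily those of the screenshot; "by_space" and "by_w" are hypothetical names. Note that "strsplit()" returns a list with one element per input string.

```r
# Split on a single space: one list element per input string
by_space <- strsplit(c("New York", "New Mexico"), split = " ")
print(by_space)

# Split on the letter "w": the delimiter itself is dropped from the result
by_w <- strsplit("New York", split = "w")
print(by_w)

# unlist() flattens the list into a plain character vector
print(unlist(by_w))
```

By default "split" is interpreted as a regular expression, so passing "fixed = TRUE" is the safer choice when the delimiter contains special characters such as ".".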

[/tatsu_inline_text][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “S1hCCakJS”]

sub() and gsub()

[/tatsu_inline_text][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “SkfpOgbyB”]

We may also need to replace a particular character in a string. The functions "sub()" and "gsub()" exist for this purpose.

[/tatsu_inline_text][tatsu_image image= “https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-35.png” image_varying_size_src= “” alignment= “none” border_width= “0” border_color= “” id= “6249” size= “full” adaptive_image= “0” max_width= ‘{“d”:”100%”}’ rebel= “0” width= ‘{“d”:”100%”}’ shadow= “none” custom_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” drop_shadow= “drop-shadow(0px 0px 0px rgba(0,0,0,0))” border_radius= “0” lazy_load= “1” placeholder_bg= “” offset= ‘{“d”:”0px 0px”}’ lightbox= “0” link= “” new_tab= “0” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” padding= ‘{“d”:””}’ margin= ‘{“d”:””}’ key= “ByajJyeJB”][/tatsu_image][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “Hyq-OCyJS”]

The difference between the two functions is easy to check: "sub()" replaces only the first occurrence of a pattern in a string (the first "a" in our example), whereas "gsub()" replaces every occurrence (hence all the "A"s in the string "AlAbAmA").
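The behaviour described above can be reproduced with a minimal sketch. It assumes the input string is "Alabama" and the lowercase pattern "a" is replaced by "A", which is consistent with the result "AlAbAmA" quoted in the text; the variable names are hypothetical.

```r
state <- "Alabama"

# sub(): only the first lowercase "a" is replaced
first_only <- sub("a", "A", state)
print(first_only)

# gsub(): every lowercase "a" is replaced
all_matches <- gsub("a", "A", state)
print(all_matches)
```

Matching is case-sensitive by default, which is why the leading uppercase "A" of "Alabama" is not counted as the first occurrence.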

[/tatsu_inline_text][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “HkWHktx-kr”]

substr()

[/tatsu_inline_text][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “rJHytxbJS”]

Suppose, this time, that we do not need to replace particular characters but only to extract a specific part of a string. With the "substr()" function we can, for example, extract the first 4 characters of every state name. Here is how it works:
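A minimal sketch of this extraction, assuming the state names come from the built-in "state.name" vector ("first4" is a hypothetical name):

```r
# Extract characters 1 to 4 of every state name
first4 <- substr(state.name, start = 1, stop = 4)

# Show the first few results
print(head(first4))
```

"substr()" is vectorised over its first argument, so one call handles all 50 names.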

[/tatsu_inline_text][tatsu_image image= “https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-36.png” image_varying_size_src= “” alignment= “none” border_width= “0” border_color= “” id= “6250” size= “full” adaptive_image= “0” max_width= ‘{“d”:”100%”}’ rebel= “0” width= ‘{“d”:”100%”}’ shadow= “none” custom_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” drop_shadow= “drop-shadow(0px 0px 0px rgba(0,0,0,0))” border_radius= “0” lazy_load= “1” placeholder_bg= “” offset= ‘{“d”:”0px 0px”}’ lightbox= “0” link= “” new_tab= “0” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” padding= ‘{“d”:””}’ margin= ‘{“d”:””}’ key= “Sy71xyx1r”][/tatsu_image][tatsu_inline_text max_width= ‘{“d”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 50px 0px “}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “Bkd7_01yB”]

As the example above shows, there are 4 special cases: "New ", "New ", "New ", "New ". These elements simply correspond to "New Hampshire", "New Jersey", "New Mexico" and "New York": in each of these strings the 4th character is a space, which in this context counts as a character like any other. This becomes even clearer if we take the first 5 characters:
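To make the trailing space visible, the four names can be isolated and inspected. This is a sketch under the assumption that the names come from "state.name"; selecting them with "grepl()" is an illustrative choice, and the variable names are hypothetical.

```r
# Keep only the state names that begin with "New"
new_states <- state.name[grepl("^New", state.name)]

# First 4 characters: "New" followed by a space, so nchar() still reports 4
first4 <- substr(new_states, 1, 4)
print(first4)
print(nchar(first4))

# First 5 characters: the space now sits between "New" and the next letter
first5 <- substr(new_states, 1, 5)
print(first5)
```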

[/tatsu_inline_text][tatsu_image image= “https://www.10h11.com/wp-content/uploads/2019/06/pasted-image-0-37.png” image_varying_size_src= “” alignment= “none” border_width= “0” border_color= “” id= “6251” size= “full” adaptive_image= “0” max_width= ‘{“d”:”100%”}’ rebel= “0” width= ‘{“d”:”100%”}’ shadow= “none” custom_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” drop_shadow= “drop-shadow(0px 0px 0px rgba(0,0,0,0))” border_radius= “0” lazy_load= “1” placeholder_bg= “” offset= ‘{“d”:”0px 0px”}’ lightbox= “0” link= “” new_tab= “0” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” padding= ‘{“d”:””}’ margin= ‘{“d”:””}’ key= “rkS7txbyr”][/tatsu_image][tatsu_inline_text max_width= ‘{“d”:”100″,”m”:”100″}’ wrap_alignment= “center” text_alignment= ‘{“d”:”left”}’ margin= ‘{“d”:”0px 0px 30px 0px”}’ bg_color= “” typography= ‘{“d”:””}’ box_shadow= “0px 0px 0px 0px rgba(0,0,0,0)” padding= ‘{“d”:”0px 0px 0px 0px”}’ border_style= ‘{“d”:”solid”,”l”:”solid”,”t”:”solid”,”m”:”solid”}’ border= ‘{“d”:”0px 0px 0px 0px”}’ border_color= “” border_radius= “0px” hide_in= “” css_id= “” css_classes= “” animate= “1” animation_type= “none” animation_delay= “0” animation_duration= “300” key= “B1K-cCJ1H”]

 
References:
  • Christensson, Per. “String Definition.” TechTerms. (2006). Accessed Nov 23, 2015. http://techterms.com/definition/string.
  • R Documentation, US State Facts and Figures – https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/state.html
  • Sanchez, G. (2013) Handling and Processing Strings in R – Trowchez Editions. Berkeley, 2013 – http://www.gastonsanchez.com/Handling and Processing Strings in R.pdf
  • Ulrich, Joshua. “Why Use R?” (2010) – http://www.r-bloggers.com/why-use-r/

[/tatsu_inline_text][/tatsu_column][/tatsu_row][/tatsu_section]