{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 10 minutes to CLX\n", "\n", "This is a short introduction to CLX geared mainly towards new users of the code.\n", "\n", "## What are these libraries?\n", "\n", "CLX (Cyber Log Accelerators) provides a simple API for security analysts, data scientists, and engineers to quickly get started applying RAPIDS to real-world cyber use cases. CLX uses the GPU dataframe ([cuDF](https://github.com/rapidsai/cudf)) and other RAPIDS packages to execute cybersecurity and information security workflows. The following packages are available:\n", "\n", "* analytics - Machine learning and statistics functionality\n", "* ip - IPv4 data translation and parsing\n", "* parsers - Cyber log event parsing\n", "* io - Input and output features for a workflow\n", "* workflow - Workflow which receives input data and produces analytical output data\n", "* osi - Open source integration (VirusTotal, FarsightDB and Whois)\n", "* dns - TLD extraction\n", "\n", "\n", "## When to use CLX\n", "\n", "Use CLX to build your cyber data analytics workflows for a GPU-accelerated environment using RAPIDS. CLX contains common cyber and cyber ML functionality, such as log parsing for specific data sources, cyber data type parsing (e.g., IPv4), and DGA detection. CLX also provides the ability to integrate this functionality into a CLX workflow, which simplifies execution of the series of parsing and ML functions needed for end-to-end use cases.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Log Parsing \n", "\n", "CLX provides traditional parsers for some common log types.\n", "Here’s an example that parses a [Windows Event Log](https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/default.aspx) record with event code [5156](https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/event.aspx?eventid=5156)." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " service_information_service_id target_account_old_account_name \\\n", "0 \n", "\n", " service_service_name group_group_name changed_attributes_account_expires \\\n", "0 \n", "\n", " detailed_authentication_information_key_length \\\n", "0 \n", "\n", " additional_information_result_code account_information_security_id \\\n", "0 \n", "\n", " changed_attributes_user_account_control \\\n", "0 \n", "\n", " process_information_caller_process_id ... changed_attributes_old_uac_value \\\n", "0 ... \n", "\n", " attributes_profile_path attributes_user_account_control \\\n", "0 \n", "\n", " account_for_which_logon_failed_account_domain \\\n", "0 \n", "\n", " account_whose_credentials_were_used_account_domain new_logon_logon_guid \\\n", "0 \n", "\n", " service_server attributes_home_directory failure_information_status \\\n", "0 \n", "\n", " failure_information_sub_status \n", "0 \n", "\n", "[1 rows x 131 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import cudf\n", "from clx.parsers.windows_event_parser import WindowsEventParser\n", "event = \"04/03/2019 11:58:59 AM\\\\nLogName=Security\\\\nSourceName=Microsoft Windows security auditing.\\\\nEventCode=5156\\\\nEventType=0\\\\nType=Information\\\\nComputerName=user234.test.com\\\\nTaskCategory=Filtering Platform Connection\\\\nOpCode=Info\\\\nRecordNumber=241754521\\\\nKeywords=Audit Success\\\\nMessage=The Windows Filtering Platform has permitted a connection.\\\\r\\\\n\\\\r\\\\nApplication Information:\\\\r\\\\n\\\\tProcess ID:\\\\t\\\\t4\\\\r\\\\n\\\\tApplication Name:\\\\tSystem\\\\r\\\\n\\\\r\\\\nNetwork Information:\\\\r\\\\n\\\\tDirection:\\\\t\\\\tInbound\\\\r\\\\n\\\\tSource Address:\\\\t\\\\t100.20.100.20\\\\r\\\\n\\\\tSource Port:\\\\t\\\\t138\\\\r\\\\n\\\\tDestination Address:\\\\t100.20.100.30\\\\r\\\\n\\\\tDestination Port:\\\\t\\\\t138\\\\r\\\\n\\\\tProtocol:\\\\t\\\\t17\\\\r\\\\n\\\\r\\\\nFilter Information:\\\\r\\\\n\\\\tFilter 
Run-Time ID:\\\\t0\\\\r\\\\n\\\\tLayer Name:\\\\t\\\\tReceive/Accept\\\\r\\\\n\\\\tLayer Run-Time ID:\\\\t44\"\n", "wep = WindowsEventParser()\n", "df = cudf.DataFrame()\n", "df['raw'] = [event]\n", "result_df = wep.parse(df, 'raw')\n", "result_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cyber Data Types\n", "\n", "CLX provides the ability to work with different data types that are specific to cybersecurity, such as IPv4 and DNS. Here’s an example of how to get started." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### IPv4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [IPv4](https://en.wikipedia.org/wiki/IPv4) data type is still commonly used and present in log files. Below we demonstrate functionality. Additional operations are available in the `clx.ip` module." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Convert IPv4 values to integers" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 89088434\n", "1 1585596973\n", "dtype: int64\n" ] } ], "source": [ "import clx.ip\n", "import cudf\n", "df = cudf.Series([\"5.79.97.178\", \"94.130.74.45\"])\n", "result_df = clx.ip.ip_to_int(df)\n", "print(result_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Check if IPv4 values are multicast" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 True\n", "1 True\n", "2 False\n", "dtype: bool\n" ] } ], "source": [ "import clx.ip\n", "import cudf\n", "df = cudf.Series([\"224.0.0.0\", \"239.255.255.255\", \"5.79.97.178\"])\n", "result_df = clx.ip.is_multicast(df)\n", "print(result_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TLD Extraction\n", "CLX provides the ability to extract the TLD from the registered domain and subdomains of a URL, using the public 
suffix list." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hostname domain \\\n", "0 www.google.com google \n", "1 gmail.com gmail \n", "2 github.com github \n", "3 pandas.pydata.org pydata \n", "4 www.worldbank.org.kg worldbank \n", "5 waiterrant.blogspot.com waiterrant \n", "6 forums.news.cnn.com.ac cnn \n", "7 forums.news.cnn.ac cnn \n", "8 b.cnn.com cnn \n", "9 a.news.uk news \n", "10 a.news.co.uk news \n", "11 a.news.co.uk news \n", "12 107-193-100-2.lightspeed.cicril.sbcglobal.net sbcglobal \n", "13 a23-44-13-2.deploy.static.akamaitechnologies.com akamaitechnologies \n", "\n", " suffix subdomain \n", "0 com www \n", "1 com \n", "2 com \n", "3 org pandas \n", "4 org.kg www \n", "5 blogspot.com \n", "6 com.ac forums.news \n", "7 ac forums.news \n", "8 com b \n", "9 uk a \n", "10 co.uk a \n", "11 co.uk a \n", "12 net 107-193-100-2.lightspeed.cicril \n", "13 com a23-44-13-2.deploy.static " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import cudf\n", "from clx.dns import dns_extractor as dns\n", "\n", "input_df = cudf.DataFrame(\n", "    {\n", "        \"url\": [\n", "            \"http://www.google.com\",\n", "            \"gmail.com\",\n", "            \"github.com\",\n", "            \"https://pandas.pydata.org\",\n", "            \"http://www.worldbank.org.kg/\",\n", "            \"waiterrant.blogspot.com\",\n", "            \"http://forums.news.cnn.com.ac/\",\n", "            \"http://forums.news.cnn.ac/\",\n", "            \"ftp://b.cnn.com/\",\n", "            \"a.news.uk\",\n", "            \"a.news.co.uk\",\n", "            \"https://a.news.co.uk\",\n", "            \"107-193-100-2.lightspeed.cicril.sbcglobal.net\",\n", "            \"a23-44-13-2.deploy.static.akamaitechnologies.com\",\n", "        ]\n", "    }\n", ")\n", "output_df = dns.parse_url(input_df[\"url\"])\n", "output_df.head(14)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Machine Learning\n", "\n", "CLX offers machine learning and statistics functions that are ready to integrate into your CLX workflow. \n", "\n", "#### Calculate Rolling Z-Score\n", "Calculate a rolling z-score on a given cuDF series." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " zscore\n", "0 null\n", "1 null\n", "2 null\n", "3 null\n", "4 null\n", "5 null\n", "6 2.374423424\n", "7 -0.645941275\n", "8 -0.683973734\n", "9 0.158832461\n", "10 1.847751909\n", "11 0.880026019\n", "12 -0.950835449\n", "13 -0.360593742\n", "14 0.111407599\n", "15 1.228914145\n", "16 -0.074966331\n", "17 -0.570321249\n", "18 0.327849973\n", "19 -0.934372308\n", "20 2.296828498\n", "21 1.282966989\n", "22 -0.795223674\n" ] } ], "source": [ "import clx.analytics.stats\n", "import cudf\n", "sequence = [3,4,5,6,1,10,34,2,1,11,45,34,2,9,19,43,24,13,23,10,98,84,10]\n", "series = cudf.Series(sequence)\n", "zscores_df = cudf.DataFrame()\n", "zscores_df['zscore'] = clx.analytics.stats.rzscore(series, 7)\n", "print(zscores_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Workflows\n", "\n", "Now that we've demonstrated the basics of CLX, let's tie some of this functionality into a CLX workflow. A workflow is defined as a function that receives a cuDF dataframe, performs some operations on it, and then returns an output cuDF dataframe. In this example, we parse raw Windows Event Log data within a workflow." ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " member_account_name attributes_password_last_set service_service_name \\\n", "0 \n", "\n", " attributes_profile_path account_information_security_id \\\n", "0 \n", "\n", " additional_information_transited_services \\\n", "0 \n", "\n", " additional_information_caller_computer_name network_information_direction \\\n", "0 inbound \n", "\n", " new_logon_account_name changed_attributes_home_drive ... \\\n", "0 ... \n", "\n", " certificate_information_certificate_issuer_name \\\n", "0 \n", "\n", " network_information_source_network_address service_information_service_name \\\n", "0 \n", "\n", " privileges account_for_which_logon_failed_account_domain \\\n", "0 \n", "\n", " network_information_network_address service_server new_account_account_name \\\n", "0 \n", "\n", " user_account_name attributes_user_account_control \n", "0 \n", "\n", "[1 rows x 131 columns]" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import cudf\n", "from clx.workflow.workflow import Workflow\n", "from clx.parsers.windows_event_parser import WindowsEventParser\n", "\n", "wep = WindowsEventParser()\n", "\n", "class LogParseWorkflow(Workflow):\n", "    def workflow(self, dataframe):\n", "        output = wep.parse(dataframe, \"raw\")\n", "        return output\n", "\n", "input_df = cudf.DataFrame()\n", "input_df[\"raw\"] = [\"04/03/2019 11:58:59 AM\\\\nLogName=Security\\\\nSourceName=Microsoft Windows security auditing.\\\\nEventCode=5156\\\\nEventType=0\\\\nType=Information\\\\nComputerName=user234.test.com\\\\nTaskCategory=Filtering Platform Connection\\\\nOpCode=Info\\\\nRecordNumber=241754521\\\\nKeywords=Audit Success\\\\nMessage=The Windows Filtering Platform has permitted a connection.\\\\r\\\\n\\\\r\\\\nApplication Information:\\\\r\\\\n\\\\tProcess ID:\\\\t\\\\t4\\\\r\\\\n\\\\tApplication Name:\\\\tSystem\\\\r\\\\n\\\\r\\\\nNetwork Information:\\\\r\\\\n\\\\tDirection:\\\\t\\\\tInbound\\\\r\\\\n\\\\tSource 
Address:\\\\t\\\\t100.20.100.20\\\\r\\\\n\\\\tSource Port:\\\\t\\\\t138\\\\r\\\\n\\\\tDestination Address:\\\\t100.20.100.30\\\\r\\\\n\\\\tDestination Port:\\\\t\\\\t138\\\\r\\\\n\\\\tProtocol:\\\\t\\\\t17\\\\r\\\\n\\\\r\\\\nFilter Information:\\\\r\\\\n\\\\tFilter Run-Time ID:\\\\t0\\\\r\\\\n\\\\tLayer Name:\\\\t\\\\tReceive/Accept\\\\r\\\\n\\\\tLayer Run-Time ID:\\\\t44\"]\n", "lpw = LogParseWorkflow(name=\"my-log-parsing-workflow\")\n", "lpw.workflow(input_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Workflow I/O\n", "\n", "A workflow can read input from and write output to different locations, including CSV files and Kafka. To integrate I/O into your workflow, specify your workflow configuration in a `workflow.yaml` file or pass it as a Python dictionary at instantiation. \n", "The workflow class looks for a configuration file in the following locations, in order: \n", "\n", "* /etc/clx/[workflow-name]/workflow.yaml then\n", "* ~/.config/clx/[workflow-name]/workflow.yaml\n", "\n", "To learn more about workflow configurations, visit the [CLX Workflow](./intro-clx-workflow.html) page." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To demonstrate the input functionality, we'll create a small CSV input file." 
] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [], "source": [ "import cudf\n", "input_df = cudf.DataFrame()\n", "input_df[\"raw\"] = [\"04/03/2019 11:58:59 AM\\\\nLogName=Security\\\\nSourceName=Microsoft Windows security auditing.\\\\nEventCode=5156\\\\nEventType=0\\\\nType=Information\\\\nComputerName=user234.test.com\\\\nTaskCategory=Filtering Platform Connection\\\\nOpCode=Info\\\\nRecordNumber=241754521\\\\nKeywords=Audit Success\\\\nMessage=The Windows Filtering Platform has permitted a connection.\\\\r\\\\n\\\\r\\\\nApplication Information:\\\\r\\\\n\\\\tProcess ID:\\\\t\\\\t4\\\\r\\\\n\\\\tApplication Name:\\\\tSystem\\\\r\\\\n\\\\r\\\\nNetwork Information:\\\\r\\\\n\\\\tDirection:\\\\t\\\\tInbound\\\\r\\\\n\\\\tSource Address:\\\\t\\\\t100.20.100.20\\\\r\\\\n\\\\tSource Port:\\\\t\\\\t138\\\\r\\\\n\\\\tDestination Address:\\\\t100.20.100.30\\\\r\\\\n\\\\tDestination Port:\\\\t\\\\t138\\\\r\\\\n\\\\tProtocol:\\\\t\\\\t17\\\\r\\\\n\\\\r\\\\nFilter Information:\\\\r\\\\n\\\\tFilter Run-Time ID:\\\\t0\\\\r\\\\n\\\\tLayer Name:\\\\t\\\\tReceive/Accept\\\\r\\\\n\\\\tLayer Run-Time ID:\\\\t44\"]\n", "input_df.to_csv(\"alert_data.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, create and run the workflow." 
] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "from clx.workflow.workflow import Workflow\n", "from clx.parsers.windows_event_parser import WindowsEventParser\n", "import os\n", "dirpath = os.getcwd()\n", "\n", "# os.getcwd() has no trailing separator, so build paths with os.path.join\n", "source = {\n", "    \"type\": \"fs\",\n", "    \"input_format\": \"csv\",\n", "    \"input_path\": os.path.join(dirpath, \"alert_data.csv\"),\n", "    \"schema\": [\"raw\"],\n", "    \"delimiter\": \",\",\n", "    \"required_cols\": [\"raw\"],\n", "    \"dtype\": [\"str\"],\n", "    \"header\": 0\n", "}\n", "destination = {\n", "    \"type\": \"fs\",\n", "    \"output_format\": \"csv\",\n", "    \"output_path\": os.path.join(dirpath, \"alert_data_output.csv\")\n", "}\n", "wep = WindowsEventParser()\n", "\n", "class LogParseWorkflow(Workflow):\n", "    def workflow(self, dataframe):\n", "        output = wep.parse(dataframe, \"raw\")\n", "        return output\n", "\n", "lpw = LogParseWorkflow(source=source, destination=destination, name=\"my-log-parsing-workflow\")\n", "lpw.run_workflow()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Output data can be read directly from the resulting CSV file." 
] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['member_account_name,attributes_password_last_set,service_service_name,attributes_profile_path,account_information_security_id,additional_information_transited_services,additional_information_caller_computer_name,network_information_direction,new_logon_account_name,changed_attributes_home_drive,filter_information_layer_run_time_id,new_logon_security_id,additional_information_result_code,eventcode,changed_attributes_logon_hours,account_information_supplied_realm_name,additional_information_ticket_options,subject_security_id,detailed_authentication_information_key_length,changed_attributes_script_path,changed_attributes_display_name,detailed_authentication_information_transited_services,subject_logon_id,changed_attributes_sam_account_name,network_information_workstation_name,service_information_service_id,subject_account_name,account_information_user_id,new_logon_account_domain,attributes_user_workstations,account_locked_out_account_name,target_account_old_account_name,network_information_protocol,attributes_home_directory,attributes_logon_hours,group_group_domain,changed_attributes_allowedtodelegateto,changed_attributes_user_account_control,network_information_source_port,attributes_user_parameters,network_information_port,application_information_process_id,attributes_sid_history,attributes_new_uac_value,process_process_name,network_information_destination_port,changed_attributes_home_directory,group_security_id,member_security_id,user_account_domain,certificate_information_certificate_serial_number,account_whose_credentials_were_used_account_domain,attributes_account_expires,subject_account_domain,process_information_caller_process_id,process_process_id,target_server_additional_information,process_information_caller_process_name,logon_type,network_information_destination_address,account_whose_credentials_were_used_logon_guid,filter_information_layer_name,add
itional_information_ticket_encryption_type,network_information_source_address,target_account_account_domain,failure_information_status,failure_information_failure_reason,process_information_process_name,target_account_security_id,filter_information_filter_run_time_id,attributes_allowed_to_delegate_to,changed_attributes_sid_history,account_for_which_logon_failed_security_id,new_account_domain_name,detailed_authentication_information_logon_process,additional_information_privileges,account_information_account_name,user_security_id,process_information_process_id,network_information_client_port,certificate_information_certificate_thumbprint,target_server_target_server_name,attributes_primary_group_id,additional_information_pre_authentication_type,changed_attributes_old_uac_value,account_information_account_domain,account_whose_credentials_were_used_account_name,id,subject_logon_guid,attributes_sam_account_name,detailed_authentication_information_authentication_package,attributes_user_principal_name,target_account_new_account_name,computername,attributes_home_drive,changed_attributes_account_expires,target_account_account_name,application_information_application_name,changed_attributes_primary_group_id,additional_information_failure_code,time,failure_information_sub_status,attributes_display_name,new_account_security_id,changed_attributes_user_principal_name,new_logon_logon_guid,changed_attributes_user_workstations,account_information_logon_guid,new_logon_logon_id,attributes_old_uac_value,changed_attributes_new_uac_value,additional_information_expiration_time,changed_attributes_password_last_set,network_information_client_address,account_for_which_logon_failed_account_name,changed_attributes_profile_path,attributes_script_path,detailed_authentication_information_package_name_ntlm_only,group_group_name,changed_attributes_user_parameters,account_locked_out_security_id,certificate_information_certificate_issuer_name,network_information_source_network_address,service_informat
ion_service_name,privileges,account_for_which_logon_failed_account_domain,network_information_network_address,service_server,new_account_account_name,user_account_name,attributes_user_account_control\\n',\n", " ',,,,,,,inbound,,,44,,,5156,,,,,,,,,,,,,,,,,,,17,,,,,,138,,,4,,,,138,,,,,,,,,,,,,,100.20.100.30,,receive/accept,,100.20.100.20,,,,,,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,system,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,\\n']" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f = open('alert_data_output.csv', \"r\")\n", "f.readlines()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Open Source Threat Intelligence Integration\n", "Often it's beneficial to integrate open source threat intelligence with collected data. CLX includes the ability to query [VirusTotal](https://www.virustotal.com) and [FarsightDB](https://www.farsightsecurity.com) directly. An API key is necessary for both of these integrations.\n", "\n", "#### Prerequisites to get API key\n", "* Create an account with https://www.virustotal.com\n", "* Create an account with https://www.farsightsecurity.com" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from clx.osi.virus_total import VirusTotalClient\n", "vt_api_key=''\n", "vt_client = VirusTotalClient(api_key=vt_api_key)\n", "result = vt_client.url_scan([\"virustotal.com\"])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from clx.osi.farsight import FarsightLookupClient\n", "server='https://api.dnsdb.info'\n", "fs_api_key=''\n", "fs_client = FarsightLookupClient(server, fs_api_key, limit=1)\n", "result = fs_client.query_rrset(\"www.dnsdb.info\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'domain_name': 'NVIDIA.COM', 'registrar': 'Safenames Ltd', 'whois_server': 'whois.safenames.net', 'referral_url': None, 
'updated_date': '04-23-2019 17:17:03,10-04-2013 20:01:01', 'creation_date': '04-20-1993 04:00:00', 'expiration_date': '04-21-2020 04:00:00', 'name_servers': 'DNS1.P09.NSONE.NET,DNS2.P09.NSONE.NET,NS5.DNSMADEEASY.COM,NS6.DNSMADEEASY.COM,NS7.DNSMADEEASY.COM', 'status': 'clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited,clientTransferProhibited https://icann.org/epp#clientTransferProhibited,serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited,serverTransferProhibited https://icann.org/epp#serverTransferProhibited,serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited', 'emails': 'abuse@safenames.net,wadmpfvzi5ei@idp.email,hostmaster@safenames.net', 'dnssec': 'unsigned', 'name': 'Data protected, not disclosed', 'org': None, 'address': '2701 San Tomas Expressway', 'city': 'Santa Clara', 'state': 'CA', 'zipcode': '95050', 'country': 'US'}]\n" ] } ], "source": [ "from clx.osi.whois import WhoIsLookupClient\n", "whois_client = WhoIsLookupClient()\n", "whois_result = whois_client.whois([\"nvidia.com\"])\n", "print(whois_result)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 4 }