Index

B C D E F H I J L M N O P R S T U V W 
All Classes and Interfaces|All Packages|Constant Field Values|Serialized Form

B

body() - Method in record class org.jweaver.crawler.internal.result.ResponseData
Returns the value of the body record component.
build(Set<String>) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
build(Set<String>) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Builds and returns a new instance of JWeaverCrawler with the configured parameters.
builder() - Static method in interface org.jweaver.crawler.JWeaverCrawler
Returns a new instance of the builder for configuring and creating a JWeaverCrawler.
BuilderValidator - Class in org.jweaver.crawler.internal.util
The BuilderValidator class provides utility methods for validating builder parameters.
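
The builder entries in this index (builder(), maxDepth(int), politenessDelay(Duration), build(Set<String>)) suggest a fluent configuration style. The sketch below is a self-contained stand-in rather than the library itself: the class CrawlerSketch, its fields, the default values, and the empty-set check are all assumptions made for illustration. Only the method names and parameter types are taken from the index.

```java
import java.time.Duration;
import java.util.Set;

// Hypothetical stand-in for a crawler configured through a fluent builder.
// It does not use or reproduce the real JWeaverCrawler implementation.
final class CrawlerSketch {
    final Set<String> baseUris;
    final int maxDepth;
    final Duration politenessDelay;

    private CrawlerSketch(Builder builder, Set<String> baseUris) {
        this.baseUris = Set.copyOf(baseUris);
        this.maxDepth = builder.maxDepth;
        this.politenessDelay = builder.politenessDelay;
    }

    // Entry point, mirroring the static builder() method listed in the index.
    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private int maxDepth = 1;                          // assumed default
        private Duration politenessDelay = Duration.ZERO;  // assumed default

        // Each setter returns the builder so that calls can be chained.
        Builder maxDepth(int maxDepth) {
            this.maxDepth = maxDepth;
            return this;
        }

        Builder politenessDelay(Duration politenessDelay) {
            this.politenessDelay = politenessDelay;
            return this;
        }

        // Assumed validation: at least one base URI is required.
        CrawlerSketch build(Set<String> baseUris) {
            if (baseUris == null || baseUris.isEmpty()) {
                throw new IllegalArgumentException("At least one base URI is required");
            }
            return new CrawlerSketch(this, baseUris);
        }
    }

    public static void main(String[] args) {
        CrawlerSketch crawler = CrawlerSketch.builder()
                .maxDepth(2)
                .politenessDelay(Duration.ofMillis(500))
                .build(Set.of("https://example.org"));
        System.out.println(crawler.maxDepth + " " + crawler.politenessDelay + " " + crawler.baseUris);
    }
}
```

Passing the base URIs to build(...) instead of a setter keeps the one mandatory input in the final call, which is one plausible reason for the signature shown in the index.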

C

characters() - Method in record class org.jweaver.crawler.internal.result.Metadata
Returns the value of the characters record component.
child() - Method in record class org.jweaver.crawler.internal.result.Connection
Returns the value of the child record component.
Connection - Record Class in org.jweaver.crawler.internal.result
The Connection record represents a connection between a parent URI and a child URI, along with the depth of the connection.
Connection(String, String, int) - Constructor for record class org.jweaver.crawler.internal.result.Connection
Creates an instance of a Connection record class.
CONNECTIONS_PREFIX - Static variable in class org.jweaver.crawler.internal.util.Constants
The prefix for connections.
Constants - Class in org.jweaver.crawler.internal.util
This class contains constants used throughout the crawling process.
content() - Method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Returns the value of the content record component.
content() - Method in interface org.jweaver.crawler.internal.result.ResultPage
Returns the content of the result page.
content() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns the value of the content record component.
CONTENT_TYPE_STR - Static variable in class org.jweaver.crawler.internal.util.Constants
The header key for specifying the content type.
create() - Static method in class org.jweaver.crawler.internal.runner.TaskExecutorImpl
Creates a new instance of TaskExecutorImpl.
create() - Static method in class org.jweaver.crawler.internal.write.JWeaverFileWriter
Constructs a new JWeaverFileWriter instance.
create(PageLink, String) - Static method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Creates an ErrorResultPage instance based on the provided PageLink and error content.
create(PageLink, String, String, Set<PageLink>) - Static method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Creates a SuccessResultPage instance based on the provided PageLink, title, content, and link set.
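
Many entries in this index (accessors such as child(), plus equals(Object), hashCode() and toString()) are generated by the compiler for record classes. The sketch below declares a local record with the same shape as the Connection(String, String, int) constructor to show what those generated members do. The component order (parent, child, depth) and the example URIs are assumptions; the record here is a stand-in, not the library class.

```java
// Illustration of compiler-generated record members, using a local stand-in.
final class ConnectionDemo {

    // Assumed component order: parent URI, child URI, depth.
    record Connection(String parent, String child, int depth) {}

    public static void main(String[] args) {
        Connection a = new Connection("https://example.org", "https://example.org/docs", 1);
        Connection b = new Connection("https://example.org", "https://example.org/docs", 1);

        // Accessors are named after the record components.
        System.out.println(a.parent() + " -> " + a.child() + " at depth " + a.depth());

        // equals and hashCode compare component values, not object identity.
        System.out.println(a.equals(b));
        System.out.println(a.hashCode() == b.hashCode());

        // toString lists every component with its value.
        System.out.println(a);
    }
}
```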

D

DEFAULT_OUTPUT_PATH - Static variable in class org.jweaver.crawler.internal.util.Constants
The default output path for file export.
depth() - Method in record class org.jweaver.crawler.internal.result.Connection
Returns the value of the depth record component.
depth() - Method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Returns the value of the depth record component.
depth() - Method in record class org.jweaver.crawler.internal.result.Metadata
Returns the value of the depth record component.
depth() - Method in record class org.jweaver.crawler.internal.result.NodeError
Returns the value of the depth record component.
depth() - Method in record class org.jweaver.crawler.internal.result.PageLink
Returns the value of the depth record component.
depth() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns the value of the depth record component.
DocumentParser - Interface in org.jweaver.crawler.internal.parse
The DocumentParser interface defines methods for extracting relevant information from HTML documents.

E

equals(Object) - Method in record class org.jweaver.crawler.internal.result.Connection
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.result.Metadata
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.result.NodeError
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.result.PageLink
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.result.ResponseData
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.write.JsonExportConfig
Indicates whether some other object is "equal to" this one.
equals(Object) - Method in record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Indicates whether some other object is "equal to" this one.
error() - Method in record class org.jweaver.crawler.internal.result.NodeError
Returns the value of the error record component.
ErrorResultPage - Record Class in org.jweaver.crawler.internal.result
The ErrorResultPage record represents a result page containing an error encountered during web crawling.
ErrorResultPage(String, int, String) - Constructor for record class org.jweaver.crawler.internal.result.ErrorResultPage
Creates an instance of an ErrorResultPage record class.
ERRORS_PREFIX - Static variable in class org.jweaver.crawler.internal.util.Constants
The prefix for errors.
ExportConfig - Interface in org.jweaver.crawler.internal.write
The ExportConfig interface defines methods for configuring data export options.
exportConfiguration(ExportConfig) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
exportConfiguration(ExportConfig) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Sets the export configuration for configuring data export options.
exportDefault() - Static method in interface org.jweaver.crawler.internal.write.ExportConfig
Creates and returns a new ExportConfig instance with default settings for exporting data in Markdown format.
ExportFileFormat - Enum Class in org.jweaver.crawler.internal.write
The ExportFileFormat enum represents the file formats supported for data export.
exportJson(String, boolean) - Static method in interface org.jweaver.crawler.internal.write.ExportConfig
Creates and returns a new ExportConfig instance configured for exporting data in JSON format.
exportMarkdown(String) - Static method in interface org.jweaver.crawler.internal.write.ExportConfig
Creates and returns a new ExportConfig instance configured for exporting data in Markdown format.
extension() - Method in enum class org.jweaver.crawler.internal.write.ExportFileFormat
Returns the file extension associated with the file format.
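
The ExportConfig entries describe static factory methods that return a format-specific configuration. A minimal sketch of that arrangement follows, using locally declared stand-in types. The type and method names mirror the index, but the file extensions, the default path "./output", and the metadata value returned by the Markdown configuration are assumptions chosen for illustration.

```java
// Stand-in types sketching static factories on an interface; not the library code.
final class ExportSketch {

    enum ExportFileFormat {
        JSON(".json"), MARKDOWN(".md");   // extension strings are assumed

        private final String extension;

        ExportFileFormat(String extension) {
            this.extension = extension;
        }

        String extension() {
            return extension;
        }
    }

    interface ExportConfig {
        String path();
        boolean metadata();
        ExportFileFormat format();

        // Default export: Markdown, written to an assumed default path.
        static ExportConfig exportDefault() {
            return new MarkdownExportConfig("./output");
        }

        static ExportConfig exportJson(String path, boolean metadata) {
            return new JsonExportConfig(path, metadata);
        }

        static ExportConfig exportMarkdown(String path) {
            return new MarkdownExportConfig(path);
        }
    }

    // path() and metadata() are the generated record accessors.
    record JsonExportConfig(String path, boolean metadata) implements ExportConfig {
        public ExportFileFormat format() {
            return ExportFileFormat.JSON;
        }
    }

    // metadata is not a record component here, so a fixed value is assumed.
    record MarkdownExportConfig(String path) implements ExportConfig {
        public boolean metadata() {
            return false;
        }

        public ExportFileFormat format() {
            return ExportFileFormat.MARKDOWN;
        }
    }

    public static void main(String[] args) {
        ExportConfig json = ExportConfig.exportJson("out", true);
        System.out.println(json.format() + " " + json.path() + " " + json.metadata());
        System.out.println(ExportConfig.exportDefault().format().extension());
    }
}
```

Callers depend only on the ExportConfig interface, so a writer can branch on format() without knowing which record was created.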

F

FILE_EXPORT_DT_FORMAT - Static variable in class org.jweaver.crawler.internal.util.Constants
The date-time format for file export.
FileUtils - Class in org.jweaver.crawler.internal.util
This utility class provides file-related operations.
format() - Method in interface org.jweaver.crawler.internal.write.ExportConfig
Returns the file format for exported data.
format() - Method in record class org.jweaver.crawler.internal.write.JsonExportConfig
Retrieves the export file format, which is JSON for this configuration.
format() - Method in record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Retrieves the export file format, which is Markdown for this configuration.

H

hashCode() - Method in record class org.jweaver.crawler.internal.result.Connection
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.result.Metadata
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.result.NodeError
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.result.PageLink
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.result.ResponseData
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.write.JsonExportConfig
Returns a hash code value for this object.
hashCode() - Method in record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Returns a hash code value for this object.
httpClient(HttpClient) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
httpClient(HttpClient) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Sets the HTTP client to be used by the crawler for making HTTP requests.

I

isAllowedContentType(String) - Static method in class org.jweaver.crawler.internal.util.URIHelper
Checks if a content type is allowed.
isAllowedUrl(String) - Static method in class org.jweaver.crawler.internal.util.URIHelper
Checks if a URL's extension is allowed.
isExternalUri(String, String) - Static method in class org.jweaver.crawler.internal.util.URIHelper
Checks if the child URI is external to the base URI.
isSuccess() - Method in record class org.jweaver.crawler.internal.result.ResponseData
Checks if the response indicates a successful request.
isValidUri(String) - Static method in class org.jweaver.crawler.internal.util.URIHelper
Checks if the provided URI is valid.
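
The URIHelper entries describe validity and external-link checks without stating the rules behind them. The sketch below shows one plausible way to implement such checks with java.net.URI. The class UriChecksSketch, the requirement that a valid URI has both a scheme and a host, and the comparison of hosts with a leading "www." ignored are all assumptions, not the documented behaviour of URIHelper.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical URI checks; the rules are assumptions made for this sketch.
final class UriChecksSketch {

    // Treats a URI as valid when it parses and has both a scheme and a host.
    static boolean isValidUri(String uri) {
        if (uri == null) {
            return false;
        }
        try {
            URI parsed = new URI(uri);
            return parsed.getScheme() != null && parsed.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Treats the child as external when its host differs from the base host.
    // URI.create throws IllegalArgumentException for malformed input.
    static boolean isExternalUri(String baseUri, String childUri) {
        String baseHost = URI.create(baseUri).getHost();
        String childHost = URI.create(childUri).getHost();
        if (baseHost == null || childHost == null) {
            return true;
        }
        return !stripWww(baseHost).equalsIgnoreCase(stripWww(childHost));
    }

    private static String stripWww(String host) {
        return host.startsWith("www.") ? host.substring(4) : host;
    }

    public static void main(String[] args) {
        System.out.println(isValidUri("https://example.org/docs"));
        System.out.println(isValidUri("not a uri"));
        System.out.println(isExternalUri("https://www.example.org", "https://example.org/docs"));
        System.out.println(isExternalUri("https://example.org", "https://other.example.com/page"));
    }
}
```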

J

JSON - Enum constant in enum class org.jweaver.crawler.internal.write.ExportFileFormat
JSON file format.
JsonExportConfig - Record Class in org.jweaver.crawler.internal.write
The JsonExportConfig record represents the configuration for exporting data in JSON format.
JsonExportConfig(String, boolean) - Constructor for record class org.jweaver.crawler.internal.write.JsonExportConfig
Creates an instance of a JsonExportConfig record class.
JWeaverBuilderImpl - Class in org.jweaver.crawler.internal.runner
A concrete implementation of the JWeaverCrawler.Builder interface used to configure and build instances of JWeaverCrawler.
JWeaverBuilderImpl() - Constructor for class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
Constructs a new JWeaverBuilderImpl instance.
JWeaverCrawler - Interface in org.jweaver.crawler
The JWeaverCrawler interface facilitates web crawling operations.
JWeaverCrawler.Builder - Interface in org.jweaver.crawler
The Builder interface provides methods for building and customizing an instance of JWeaverCrawler.
JWeaverCrawlerImpl - Class in org.jweaver.crawler.internal.runner
A concrete implementation of JWeaverCrawler providing web crawling functionality.
JWeaverCrawlerImpl(JWeaverBuilderImpl) - Constructor for class org.jweaver.crawler.internal.runner.JWeaverCrawlerImpl
Constructs a new JWeaverCrawlerImpl instance.
JWeaverDocumentParser - Class in org.jweaver.crawler.internal.parse
The JWeaverDocumentParser class is responsible for parsing HTML documents to extract relevant information.
JWeaverDocumentParser() - Constructor for class org.jweaver.crawler.internal.parse.JWeaverDocumentParser
Constructs a new JWeaverDocumentParser instance.
JWeaverExecutionException - Exception Class in org.jweaver.crawler.internal.exception
The JWeaverExecutionException class represents an unchecked exception that occurs during the execution of a JWeaver task.
JWeaverExecutionException(String) - Constructor for exception class org.jweaver.crawler.internal.exception.JWeaverExecutionException
Constructs a new JWeaverExecutionException (RuntimeException) with the specified detail message.
JWeaverFileWriter - Class in org.jweaver.crawler.internal.write
A concrete implementation of the JWeaverWriter interface for writing data to files.
JWeaverTask - Class in org.jweaver.crawler.internal.runner
Handles the crawling process for a base URI.
JWeaverWriter - Interface in org.jweaver.crawler.internal.write
The JWeaverWriter interface defines methods for processing and writing the results of the web crawling process.

L

linkSet() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns the value of the linkSet record component.

M

MARKDOWN - Enum constant in enum class org.jweaver.crawler.internal.write.ExportFileFormat
Markdown file format.
MarkdownExportConfig - Record Class in org.jweaver.crawler.internal.write
The MarkdownExportConfig record represents the configuration for exporting data in Markdown format.
MarkdownExportConfig(String) - Constructor for record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Creates an instance of a MarkdownExportConfig record class.
maxDepth(int) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
maxDepth(int) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Sets the maximum crawling depth.
metadata() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns the value of the metadata record component.
metadata() - Method in interface org.jweaver.crawler.internal.write.ExportConfig
Returns a boolean indicating whether metadata should be included in the export.
metadata() - Method in record class org.jweaver.crawler.internal.write.JsonExportConfig
Returns the value of the metadata record component.
metadata() - Method in record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Specifies whether metadata should be included in the export.
Metadata - Record Class in org.jweaver.crawler.internal.result
The Metadata record represents metadata associated with a web page.
Metadata(String, int, String, int) - Constructor for record class org.jweaver.crawler.internal.result.Metadata
Creates an instance of a Metadata record class.
mkdir(File, boolean) - Static method in class org.jweaver.crawler.internal.util.FileUtils
Creates a directory if it does not exist.

N

NodeError - Record Class in org.jweaver.crawler.internal.result
The NodeError record represents an error associated with a specific node during web crawling.
NodeError(String, int, String) - Constructor for record class org.jweaver.crawler.internal.result.NodeError
Creates an instance of a NodeError record class.

O

org.jweaver.crawler - package org.jweaver.crawler
 
org.jweaver.crawler.internal.exception - package org.jweaver.crawler.internal.exception
 
org.jweaver.crawler.internal.parse - package org.jweaver.crawler.internal.parse
 
org.jweaver.crawler.internal.result - package org.jweaver.crawler.internal.result
 
org.jweaver.crawler.internal.runner - package org.jweaver.crawler.internal.runner
 
org.jweaver.crawler.internal.util - package org.jweaver.crawler.internal.util
 
org.jweaver.crawler.internal.write - package org.jweaver.crawler.internal.write
 
OutputFileException - Exception Class in org.jweaver.crawler.internal.exception
The OutputFileException class represents an unchecked exception that occurs when there is an issue with an output file or directory.
OutputFileException(Exception) - Constructor for exception class org.jweaver.crawler.internal.exception.OutputFileException
Constructs a new OutputFileException (RuntimeException) with the specified cause.

P

PageLink - Record Class in org.jweaver.crawler.internal.result
The PageLink record represents a link to a web page along with its depth in the crawling hierarchy.
PageLink(String, int) - Constructor for record class org.jweaver.crawler.internal.result.PageLink
Creates an instance of a PageLink record class.
parent() - Method in record class org.jweaver.crawler.internal.result.Connection
Returns the value of the parent record component.
parseBody(String, String) - Method in interface org.jweaver.crawler.internal.parse.DocumentParser
Parses the HTML body of a web page and extracts the main content body.
parseBody(String, String) - Method in class org.jweaver.crawler.internal.parse.JWeaverDocumentParser
 
parseLinks(String, String) - Method in interface org.jweaver.crawler.internal.parse.DocumentParser
Parses the HTML body of a web page and extracts the links contained within it.
parseLinks(String, String) - Method in class org.jweaver.crawler.internal.parse.JWeaverDocumentParser
 
parser(DocumentParser) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
parser(DocumentParser) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Sets the document parser used by the crawler to extract relevant information from the HTML body.
parseTitle(String, String) - Method in interface org.jweaver.crawler.internal.parse.DocumentParser
Parses the HTML body of a web page and extracts the title.
parseTitle(String, String) - Method in class org.jweaver.crawler.internal.parse.JWeaverDocumentParser
 
path() - Method in interface org.jweaver.crawler.internal.write.ExportConfig
Returns the path where exported data will be saved.
path() - Method in record class org.jweaver.crawler.internal.write.JsonExportConfig
Returns the value of the path record component.
path() - Method in record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Returns the value of the path record component.
politenessDelay(Duration) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
politenessDelay(Duration) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Sets the politeness delay between consecutive requests made by the crawler to the same host.
processConnectionMap(String, List<Connection>, ExportConfig) - Method in class org.jweaver.crawler.internal.write.JWeaverFileWriter
 
processConnectionMap(String, List<Connection>, ExportConfig) - Method in interface org.jweaver.crawler.internal.write.JWeaverWriter
Processes connection map information generated during crawling and writes it using the provided export configuration.
processErrors(String, List<NodeError>, ExportConfig) - Method in class org.jweaver.crawler.internal.write.JWeaverFileWriter
 
processErrors(String, List<NodeError>, ExportConfig) - Method in interface org.jweaver.crawler.internal.write.JWeaverWriter
Processes errors encountered during crawling and writes error information using the provided export configuration.
processSuccess(SuccessResultPage, ExportConfig) - Method in class org.jweaver.crawler.internal.write.JWeaverFileWriter
 
processSuccess(SuccessResultPage, ExportConfig) - Method in interface org.jweaver.crawler.internal.write.JWeaverWriter
Processes a successfully crawled page and writes the result using the provided export configuration.

R

requireNonEmpty(Collection<T>) - Static method in class org.jweaver.crawler.internal.util.BuilderValidator
Validates that the specified collection is not null.
requireNonEmpty(Collection<T>, String) - Static method in class org.jweaver.crawler.internal.util.BuilderValidator
Validates that the specified collection is not null or empty.
requireNonEmpty(T) - Static method in class org.jweaver.crawler.internal.util.BuilderValidator
Validates that the specified object is not null.
requireNonEmpty(T, String) - Static method in class org.jweaver.crawler.internal.util.BuilderValidator
Validates that the specified object is not null or empty.
ResponseData<T> - Record Class in org.jweaver.crawler.internal.result
The ResponseData record represents the response data received from a web request.
ResponseData(int, T) - Constructor for record class org.jweaver.crawler.internal.result.ResponseData
Creates an instance of a ResponseData record class.
ResultPage - Interface in org.jweaver.crawler.internal.result
The ResultPage interface represents a result page obtained during web crawling.
retrievedOn() - Method in record class org.jweaver.crawler.internal.result.Metadata
Returns the value of the retrievedOn record component.
run() - Method in class org.jweaver.crawler.internal.runner.JWeaverCrawlerImpl
 
run() - Method in interface org.jweaver.crawler.JWeaverCrawler
Runs the crawling tasks sequentially.
run(List<JWeaverTask>) - Method in interface org.jweaver.crawler.internal.runner.TaskExecutor
Executes the specified list of tasks sequentially.
run(List<JWeaverTask>) - Method in class org.jweaver.crawler.internal.runner.TaskExecutorImpl
 
RUNNER_THREAD_NAME - Static variable in class org.jweaver.crawler.internal.util.Constants
The prefix for the runner thread name.
runParallel() - Method in class org.jweaver.crawler.internal.runner.JWeaverCrawlerImpl
 
runParallel() - Method in interface org.jweaver.crawler.JWeaverCrawler
Runs the crawling tasks in parallel; this should be the preferred way to run the crawler.
runParallel(List<JWeaverTask>) - Method in interface org.jweaver.crawler.internal.runner.TaskExecutor
Executes the specified list of tasks in parallel.
runParallel(List<JWeaverTask>) - Method in class org.jweaver.crawler.internal.runner.TaskExecutorImpl
 

S

source() - Method in record class org.jweaver.crawler.internal.result.Metadata
Returns the value of the source record component.
statusCode() - Method in record class org.jweaver.crawler.internal.result.ResponseData
Returns the value of the statusCode record component.
SuccessResultPage - Record Class in org.jweaver.crawler.internal.result
The SuccessResultPage record represents a successful result page obtained during web crawling.
SuccessResultPage(String, String, String, Set<PageLink>, Metadata, int) - Constructor for record class org.jweaver.crawler.internal.result.SuccessResultPage
Creates an instance of a SuccessResultPage record class.

T

TaskExecutor - Interface in org.jweaver.crawler.internal.runner
The TaskExecutor interface defines methods for executing tasks either in parallel or sequentially.
TaskExecutorImpl - Class in org.jweaver.crawler.internal.runner
A concrete implementation of the TaskExecutor interface responsible for executing tasks.
title() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns the value of the title record component.
toString() - Method in record class org.jweaver.crawler.internal.result.Connection
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.result.Metadata
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.result.NodeError
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.result.PageLink
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.result.ResponseData
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.write.JsonExportConfig
Returns a string representation of this record class.
toString() - Method in record class org.jweaver.crawler.internal.write.MarkdownExportConfig
Returns a string representation of this record class.
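
The TaskExecutor entries in this section distinguish sequential from parallel execution of a task list. The sketch below contrasts the two using plain Runnable tasks and a fixed thread pool. The class ExecutorSketch, the use of Runnable in place of JWeaverTask, and the one-thread-per-task pool size are assumptions made for illustration; the index does not say how TaskExecutorImpl schedules its work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical executor contrasting sequential and parallel task execution.
final class ExecutorSketch {

    // Sequential: each task finishes before the next one starts.
    static void run(List<Runnable> tasks) {
        for (Runnable task : tasks) {
            task.run();
        }
    }

    // Parallel: tasks are submitted to a pool and awaited together.
    static void runParallel(List<Runnable> tasks) {
        if (tasks.isEmpty()) {
            return; // a fixed pool cannot be created with zero threads
        }
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            List<Callable<Void>> calls = new ArrayList<>();
            for (Runnable task : tasks) {
                calls.add(() -> {
                    task.run();
                    return null;
                });
            }
            pool.invokeAll(calls); // blocks until every task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        Runnable increment = counter::incrementAndGet;
        List<Runnable> tasks = List.of(increment, increment, increment);

        run(tasks);
        runParallel(tasks);
        System.out.println("Tasks executed: " + counter.get());
    }
}
```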

U

uri() - Method in record class org.jweaver.crawler.internal.result.ErrorResultPage
Returns the value of the uri record component.
uri() - Method in record class org.jweaver.crawler.internal.result.NodeError
Returns the value of the uri record component.
uri() - Method in interface org.jweaver.crawler.internal.result.ResultPage
Returns the URI of the result page.
uri() - Method in record class org.jweaver.crawler.internal.result.SuccessResultPage
Returns the value of the uri record component.
URIHelper - Class in org.jweaver.crawler.internal.util
The URIHelper class provides utility methods for handling and validating URIs.
url() - Method in record class org.jweaver.crawler.internal.result.PageLink
Returns the value of the url record component.

V

valueOf(String) - Static method in enum class org.jweaver.crawler.internal.write.ExportFileFormat
Returns the enum constant of this class with the specified name.
values() - Static method in enum class org.jweaver.crawler.internal.write.ExportFileFormat
Returns an array containing the constants of this enum class, in the order they are declared.

W

writer(JWeaverWriter) - Method in class org.jweaver.crawler.internal.runner.JWeaverBuilderImpl
 
writer(JWeaverWriter) - Method in interface org.jweaver.crawler.JWeaverCrawler.Builder
Sets the writer for exporting the crawled data.
WRITER_THREAD_NAME - Static variable in class org.jweaver.crawler.internal.util.Constants
The prefix for the writer thread name.
WWW_STR - Static variable in class org.jweaver.crawler.internal.util.Constants
The string representation for 'www'.